JOSS: https://github.com/adbar/htmldate
Score: 23.647665264120654
Last synced: about 16 hours ago
Repository metadata:
Fast and robust date extraction from web pages, with Python or on the command line
- Host: GitHub
- URL: https://github.com/adbar/htmldate
- Owner: adbar
- License: apache-2.0
- Created: 2017-08-24T14:09:12.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2025-11-04T17:50:33.000Z (3 months ago)
- Last Synced: 2026-02-02T10:58:19.815Z (4 days ago)
- Topics: date, date-parser, datetime, digital-forensics, entity-extraction, forensics-tools, information-extraction, metadata, metadata-extraction, natural-language-processing, nlp, opengraph, web-scraping, webscraping
- Language: Python
- Homepage: https://htmldate.readthedocs.io
- Size: 30.1 MB
- Stars: 145
- Watchers: 4
- Forks: 28
- Open Issues: 11
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
Funding:
- GitHub: adbar
- Ko-fi: adbarbaresi
Owner metadata:
- Name: Adrien Barbaresi
- Login: adbar
- Email:
- Kind: user
- Description: Research scientist – natural language processing, web scraping and text analytics. Mostly with Python.
- Website: adrien.barbaresi.eu
- Location: Berlin
- Twitter: adbarbaresi
- Company: Berlin-Brg. Academy of Sciences (BBAW)
- Icon url: https://avatars.githubusercontent.com/u/2125866?u=e2eeeb3384ab9391f598ade37d4388ad23199ebd&v=4
- Repositories: 37
- Last Synced at: 2024-06-11T15:59:25.206Z
- Profile URL: https://github.com/adbar
GitHub Events
Total
- Create event: 13
- Delete event: 12
- Fork event: 2
- Issue comment event: 13
- Issues event: 8
- Pull request event: 23
- Push event: 21
- Release event: 2
- Watch event: 20
- Total: 114
Last Year
- Create event: 1
- Delete event: 1
- Fork event: 2
- Issue comment event: 4
- Issues event: 4
- Pull request event: 2
- Push event: 3
- Watch event: 15
- Total: 32
Committers metadata
Last synced: about 1 month ago
Total Commits: 682
Total Committers: 25
Avg Commits per committer: 27.28
Development Distribution Score (DDS): 0.252
Commits in past year: 3
Committers in past year: 2
Avg Commits per committer in past year: 1.5
Development Distribution Score (DDS) in past year: 0.333
| Name | Email | Commits |
|---|---|---|
| Adrien Barbaresi | b****i@b****e | 510 |
| Adrien Barbaresi | a****i@o****t | 65 |
| evolutionoftheuniverse | 6****e | 31 |
| DerKozmonaut | 5****t | 17 |
| Adrien Barbaresi | a****i@e****r | 15 |
| Corey Dockser | c****r@g****m | 9 |
| Radhi Fadlillah | m****f@g****m | 7 |
| dependabot[bot] | 4****] | 6 |
| Vincent Barbaresi | v****i@o****m | 2 |
| Daniel S. Katz | d****z@i****g | 2 |
| Rahul B | r****t@g****m | 2 |
| kernc | k****e@g****m | 2 |
| sourcery-ai[bot] | 5****] | 2 |
| Adam Hupp | 1****i | 1 |
| Andrei Zhemaituk | a****k@g****m | 1 |
| Ashik Paul | a****7@g****m | 1 |
| B3N | b****@a****t | 1 |
| EkaterineSheshelidze | 8****e | 1 |
| Felipe Hertzer | f****r@g****m | 1 |
| Lawrence M Stewart | g****a | 1 |
| MSK1582 | 6****2 | 1 |
| Nada Ayesh | n****0@s****s | 1 |
| SalihTalha | 4****a | 1 |
| lgtm-com[bot] | 4****] | 1 |
| liulinlin90 | l****0@g****m | 1 |
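
The Development Distribution Score (DDS) reported above can be reproduced from the commit counts in the table. The following is a minimal sketch, assuming DDS is defined as 1 minus the top committer's share of all commits, and that each table row (one masked email identity) counts as a separate committer. The function name `development_distribution_score` is illustrative and not part of any tool referenced here.

```python
def development_distribution_score(commit_counts):
    """Return 1 - (largest commit count / total commits).

    Assumed definition of DDS; 0.0 is returned for an empty history.
    """
    total = sum(commit_counts)
    if total == 0:
        return 0.0
    return 1 - max(commit_counts) / total


# Commit counts per row of the committers table above.
commits = [510, 65, 31, 17, 15, 9, 7, 6, 2, 2, 2, 2, 2] + [1] * 12

total_commits = sum(commits)
average = total_commits / len(commits)
dds = development_distribution_score(commits)

print(total_commits, len(commits), round(average, 2), round(dds, 3))
# prints: 682 25 27.28 0.252
```

Under the same assumption, the past-year value of 0.333 corresponds to one of the two committers authoring 2 of the 3 commits (1 - 2/3).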
Issue and Pull Request metadata
Last synced: 3 months ago
Total issues: 60
Total pull requests: 155
Average time to close issues: 2 months
Average time to close pull requests: 3 days
Total issue authors: 34
Total pull request authors: 22
Average comments per issue: 2.25
Average comments per pull request: 1.32
Merged pull requests: 100
Bot issues: 0
Bot pull requests: 53
Past year issues: 6
Past year pull requests: 17
Past year average time to close issues: 19 days
Past year average time to close pull requests: 3 days
Past year issue authors: 6
Past year pull request authors: 3
Past year average comments per issue: 0.17
Past year average comments per pull request: 1.24
Past year merged pull requests: 12
Past year bot issues: 0
Past year bot pull requests: 4
Top Issue Authors
- adbar (25)
- RadhiFadlillah (2)
- rahulbot (2)
- wangyu190810 (1)
- geoffbacon (1)
- evolutionoftheuniverse (1)
- Kulratis (1)
- PetroffSky (1)
- frenzymadness (1)
- zhemaituk (1)
- masylum (1)
- alroythalus (1)
- Ismael-Hery (1)
- eupattaro89 (1)
- arcombe012 (1)
Top Pull Request Authors
- adbar (71)
- dependabot[bot] (43)
- sourcery-ai[bot] (9)
- DerKozmonaut (3)
- evolutionoftheuniverse (3)
- b3n4kh (2)
- danielskatz (2)
- nadasuhailAyesh12 (2)
- EkaterineSheshelidze (2)
- vbarbaresi (2)
- felipehertzer (2)
- mohmmadAyesh (2)
- kernc (2)
- zhemaituk (2)
- SalihTalha (1)
Top Issue Labels
- bug (12)
- enhancement (11)
- maintenance (7)
- question (6)
- up for grabs (6)
- good first issue (2)
- documentation (2)
Top Pull Request Labels
- dependencies (43)
- python (14)
- github_actions (4)
Package metadata
- Total packages: 1
Total downloads:
- PyPI: 4,774,467 (last month)
- Total docker downloads: 606
- Total dependent packages: 5
- Total dependent repositories: 50
- Total versions: 59
- Total maintainers: 1
pypi.org: htmldate
Fast and robust extraction of original and updated publication dates from URLs and web pages.
- Homepage: https://htmldate.readthedocs.io
- Documentation: https://htmldate.readthedocs.io/
- Licenses: Apache 2.0
- Latest release: 1.9.4 (published 3 months ago)
- Last Synced: 2026-01-31T19:29:13.460Z (6 days ago)
- Versions: 59
- Dependent Packages: 5
- Dependent Repositories: 50
- Downloads: 4,774,467 (last month)
- Docker Downloads: 606
Rankings:
- Downloads: 0.463%
- Dependent packages count: 1.865%
- Dependent repos count: 2.081%
- Docker downloads count: 2.381%
- Average: 3.768%
- Stargazers count: 7.271%
- Forks count: 8.546%
- Maintainers (1)
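
The "Average" entry in the rankings appears to be the arithmetic mean of the six individual ranking percentages. A quick check, assuming equal weights (the listing does not state how the average is computed; the dictionary keys are illustrative labels):

```python
# Ranking percentages copied from the list above.
rankings = {
    "downloads": 0.463,
    "dependent_packages_count": 1.865,
    "dependent_repos_count": 2.081,
    "docker_downloads_count": 2.381,
    "stargazers_count": 7.271,
    "forks_count": 8.546,
}

# Unweighted mean over the six categories.
average = sum(rankings.values()) / len(rankings)
print(f"{average:.3f}%")
# prints: 3.768%
```

The result matches the listed 3.768%, which is consistent with an unweighted mean.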
Dependencies
- actions/checkout v3 composite
- github/codeql-action/analyze v2 composite
- github/codeql-action/autobuild v2 composite
- github/codeql-action/init v2 composite
- actions/cache v2 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- codecov/codecov-action v3 composite
- htmldate *
- sphinx >=7.2.6
- backports-datetime-fromisoformat *
- charset_normalizer *
- dateparser *
- lxml *
- python-dateutil *
- urllib3 *
- articleDateExtractor ==0.20 test
- date_guesser ==2.1.4 test
- goose3 ==3.1.17 test
- htmldate >=1.5.0 test
- news-please ==1.5.35 test
- newspaper3k ==0.2.8 test
- tabulate ==0.9.0 test