JOSS: https://github.com/ropensci/tokenizers
nlp peer-reviewed r r-package rstats text-mining tokenizer
Score: 19.997384374221063
Last synced: about 12 hours ago
Repository metadata:
Fast, Consistent Tokenization of Natural Language Text
- Host: GitHub
- URL: https://github.com/ropensci/tokenizers
- Owner: ropensci
- License: other
- Created: 2016-03-25T04:16:33.000Z (about 10 years ago)
- Default Branch: master
- Last Pushed: 2024-03-27T09:33:53.000Z (about 2 years ago)
- Last Synced: 2025-10-26T01:38:59.840Z (8 months ago)
- Topics: nlp, peer-reviewed, r, r-package, rstats, text-mining, tokenizer
- Language: R
- Homepage: https://docs.ropensci.org/tokenizers
- Size: 1.24 MB
- Stars: 186
- Watchers: 15
- Forks: 24
- Open Issues: 1
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md
- License: LICENSE
Owner metadata:
- Name: rOpenSci
- Login: ropensci
- Email: info@ropensci.org
- Kind: organization
- Description:
- Website: https://ropensci.org/
- Location: Berkeley, CA
- Twitter: rOpenSci
- Company:
- Icon url: https://avatars.githubusercontent.com/u/1200269?v=4
- Repositories: 307
- Last Synced at: 2023-03-10T20:30:59.242Z
- Profile URL: https://github.com/ropensci
GitHub Events
Total
- Watch event: 3
- Total: 3
Last Year
- Watch event: 3
- Total: 3
Committers metadata
Last synced: 8 months ago
Total Commits: 192
Total Committers: 13
Avg Commits per committer: 14.769
Development Distribution Score (DDS): 0.182
Commits in past year: 0
Committers in past year: 0
Avg Commits per committer in past year: 0.0
Development Distribution Score (DDS) in past year: 0.0
| Name | Email | Commits |
|---|---|---|
| Lincoln Mullen | l****n@l****m | 157 |
| Oliver Keyes | i****s@g****m | 13 |
| Dmitriy Selivanov | s****y@g****m | 6 |
| jrnold | j****d@g****m | 4 |
| Kenneth Benoit | k****t@l****k | 4 |
| tcharlon | c****n@p****m | 1 |
| Maëlle Salmon | m****n@y****e | 1 |
| Karthik Ram | k****m@g****m | 1 |
| Julia Silge | j****e@g****m | 1 |
| Jeroen Ooms | j****s@g****m | 1 |
| Hideaki Hayashi | h****h@g****m | 1 |
| Emil Hvitfeldt | e****t@g****m | 1 |
| ChrisMuir | c****A@g****m | 1 |
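The summary figures in this section can be reproduced from the committers table. The sketch below assumes the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is not stated on this page, but it matches the reported value. The function names are illustrative only.

```python
# Commit counts per committer, copied from the committers table above.
commits = {
    "Lincoln Mullen": 157,
    "Oliver Keyes": 13,
    "Dmitriy Selivanov": 6,
    "jrnold": 4,
    "Kenneth Benoit": 4,
    "tcharlon": 1,
    "Maëlle Salmon": 1,
    "Karthik Ram": 1,
    "Julia Silge": 1,
    "Jeroen Ooms": 1,
    "Hideaki Hayashi": 1,
    "Emil Hvitfeldt": 1,
    "ChrisMuir": 1,
}


def average_commits(counts):
    """Mean number of commits per committer."""
    return sum(counts.values()) / len(counts)


def development_distribution_score(counts):
    """Assumed definition: 1 - (commits by top committer / total commits)."""
    total = sum(counts.values())
    top = max(counts.values())
    return 1 - top / total


print(sum(commits.values()), len(commits))               # 192 13
print(f"{average_commits(commits):.3f}")                 # 14.769
print(f"{development_distribution_score(commits):.3f}")  # 0.182
```

A score near 0 means one person wrote almost all commits; here a single committer accounts for 157 of 192.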
Issue and Pull Request metadata
Last synced: 10 months ago
Total issues: 64
Total pull requests: 23
Average time to close issues: 6 months
Average time to close pull requests: 16 days
Total issue authors: 29
Total pull request authors: 12
Average comments per issue: 4.41
Average comments per pull request: 1.83
Merged pull requests: 23
Bot issues: 0
Bot pull requests: 0
Past year issues: 0
Past year pull requests: 0
Past year average time to close issues: N/A
Past year average time to close pull requests: N/A
Past year issue authors: 0
Past year pull request authors: 0
Past year average comments per issue: 0
Past year average comments per pull request: 0
Past year merged pull requests: 0
Past year bot issues: 0
Past year bot pull requests: 0
Top Issue Authors
- lmullen (24)
- dselivanov (5)
- EmilHvitfeldt (2)
- randomgambit (2)
- maelle (2)
- fschaffner (2)
- alanault (2)
- juliasilge (2)
- Ironholds (2)
- hope-data-science (2)
- kbenoit (1)
- ekstroem (1)
- ablaette (1)
- kevinbsc (1)
- jeroen (1)
Top Pull Request Authors
- Ironholds (4)
- lmullen (4)
- kbenoit (4)
- dselivanov (3)
- maelle (1)
- EmilHvitfeldt (1)
- juliasilge (1)
- ChrisMuir (1)
- jrnold (1)
- hideaki (1)
- jeroen (1)
- karthik (1)
Top Issue Labels
- bug (1)
- help wanted (1)
Top Pull Request Labels
Package metadata
- Total packages: 2
- Total downloads:
  - cran: 56,299 last month
- Total docker downloads: 142,616
- Total dependent packages: 20 (may contain duplicates)
- Total dependent repositories: 39 (may contain duplicates)
- Total versions: 17
- Total maintainers: 1
proxy.golang.org: github.com/ropensci/tokenizers
- Homepage:
- Documentation: https://pkg.go.dev/github.com/ropensci/tokenizers#section-documentation
- Licenses: other
- Latest release: v0.3.0 (published over 3 years ago)
- Last Synced: 2025-10-30T03:58:23.780Z (8 months ago)
- Versions: 8
- Dependent Packages: 0
- Dependent Repositories: 0
Rankings:
- Dependent packages count: 5.389%
- Average: 5.57%
- Dependent repos count: 5.75%
cran.r-project.org: tokenizers
Fast, Consistent Tokenization of Natural Language Text
- Homepage: https://docs.ropensci.org/tokenizers/
- Documentation: http://cran.r-project.org/web/packages/tokenizers/tokenizers.pdf
- Licenses: MIT + file LICENSE
- Latest release: 0.3.0 (published over 3 years ago)
- Last Synced: 2025-10-30T03:58:33.350Z (8 months ago)
- Versions: 9
- Dependent Packages: 20
- Dependent Repositories: 39
- Downloads: 56,299 last month
- Docker Downloads: 142,616
Rankings:
- Downloads: 2.255%
- Stargazers count: 2.339%
- Forks count: 2.942%
- Dependent packages count: 3.298%
- Dependent repos count: 4.154%
- Average: 5.874%
- Docker downloads count: 20.256%
- Maintainers (1)
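The "Average" entry in the CRAN rankings appears to be the plain mean of the six individual percentile ranks listed with it. That is an assumption inferred from the numbers, not something this page states; the short check below reproduces it, using a hypothetical helper name.

```python
# Percentile ranks for the CRAN package, copied from the Rankings list above.
cran_rankings = {
    "downloads": 2.255,
    "stargazers_count": 2.339,
    "forks_count": 2.942,
    "dependent_packages_count": 3.298,
    "dependent_repos_count": 4.154,
    "docker_downloads_count": 20.256,
}


def mean_rank(rankings):
    """Unweighted mean of the individual percentile ranks (assumed formula)."""
    return sum(rankings.values()) / len(rankings)


print(f"{mean_rank(cran_rankings):.3f}%")  # 5.874%
```

Because the mean is unweighted, the single weak rank (Docker downloads, 20.256%) pulls the average well above the other five values.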
Dependencies
- R >= 3.1.3 (depends)
- Rcpp >= 0.12.3 (imports)
- SnowballC >= 0.5.1 (imports)
- stringi >= 1.0.1 (imports)
- covr * (suggests)
- knitr * (suggests)
- rmarkdown * (suggests)
- stopwords >= 0.9.0 (suggests)
- testthat * (suggests)
- rocker/shiny-verse 4.3.2 (build)