An open API service for producing an overview of a list of open source projects.

awesome-llama: https://github.com/datajuicer/data-juicer

data data-analysis data-pipeline data-processing data-science data-visualization foundation-models instruction-tuning large-language-models llm llms multi-modal pre-training synthetic-data

Score: 19.759112760345214

Last synced: about 12 hours ago

Repository metadata:

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷


Committers metadata

Last synced: about 2 months ago

Total Commits: 528
Total Committers: 40
Avg Commits per committer: 13.2
Development Distribution Score (DDS): 0.71

Commits in past year: 223
Committers in past year: 25
Avg Commits per committer in past year: 8.92
Development Distribution Score (DDS) in past year: 0.677

Name Email Commits
Yilun Huang l****l@a****m 153
Daoyuan Chen 6****c 49
BeachWang 1****7@p****n 48
Cathy0908 3****8 41
Ce Ge (戈策) g****e@f****m 35
cmgzn 8****n 30
zhijianma z****j@a****m 30
garyzhang99 4****9 17
Cyrus Zhang c****g@g****m 13
chenhesen h****s@a****m 12
Xuchen Pan 3****c 12
Qirui-jiao 1****o 12
Yuhan Liu 3****x 10
co63oc c****c 10
Zhen Qin z****n@g****m 7
Xinyu Zhang 6****h 6
chenyushuo 2****6@q****m 5
Du Bin d****5@g****m 4
kyotom 3****m 4
John Giorgi j****i@g****m 3
lingzhq 1****q 3
2108038773 1****3 2
JamieYu y****a@f****m 2
ShenQianli s****i@u****u 2
Yuexiang XIE y****x@a****m 2
Shurui Kou 1****2@q****m 2
weijie 3****o 1
simplaj 3****j 1
seanzhang-zhichen 7****n 1
ricksun2023 1****3 1
and 10 more...
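
The all-time figures above can be reproduced from the listed counts. The following is a minimal sketch, assuming the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is an assumption not stated on this page, and the function name is hypothetical. With 153 of 528 commits by the top committer, it agrees with the reported values.

```python
def development_distribution_score(top_committer_commits: int, total_commits: int) -> float:
    """Assumed definition: 1 - (top committer's commits / total commits)."""
    if total_commits <= 0:
        raise ValueError("total_commits must be positive")
    return 1 - top_committer_commits / total_commits


# Values taken from the committers metadata above (all-time figures).
total_commits = 528
total_committers = 40
top_committer_commits = 153  # first row of the committer table

avg_commits = total_commits / total_committers
dds = development_distribution_score(top_committer_commits, total_commits)

print(round(avg_commits, 1))  # 13.2
print(round(dds, 2))          # 0.71
```

The past-year DDS cannot be checked the same way, because the table lists all-time commit counts only.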

Issue and Pull Request metadata

Last synced: 3 months ago

Total issues: 0
Total pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 0
Total pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past year issues: 0
Past year pull requests: 0
Past year average time to close issues: N/A
Past year average time to close pull requests: N/A
Past year issue authors: 0
Past year pull request authors: 0
Past year average comments per issue: 0
Past year average comments per pull request: 0
Past year merged pull requests: 0
Past year bot issues: 0
Past year bot pull requests: 0

More stats: https://issues.ecosyste.ms/repositories/lookup?url=https://github.com/datajuicer/data-juicer
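
The lookup link above takes the repository URL as a `url` query parameter. As a small sketch, the same link can be built for any repository with the standard library; the helper name `build_lookup_url` is hypothetical, and the percent-encoded form it produces is assumed to be accepted in place of the unencoded form shown above.

```python
from urllib.parse import urlencode

# Endpoint copied from the "More stats" link above.
LOOKUP_ENDPOINT = "https://issues.ecosyste.ms/repositories/lookup"


def build_lookup_url(repo_url: str) -> str:
    """Return the lookup link for a repository, percent-encoding the `url` parameter."""
    return f"{LOOKUP_ENDPOINT}?{urlencode({'url': repo_url})}"


lookup_url = build_lookup_url("https://github.com/datajuicer/data-juicer")
print(lookup_url)
```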


Package metadata

pypi.org: py-data-juicer

Data Processing for and with Foundation Models.

  • Homepage:
  • Documentation: https://py-data-juicer.readthedocs.io/
  • Licenses: Apache-2.0
  • Latest release: 1.5.0 (published about 1 month ago)
  • Last Synced: 2026-03-16T22:28:11.400Z (18 days ago)
  • Versions: 26
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 1,544 last month
  • Rankings:
    • Stargazers count: 7.067%
    • Dependent packages count: 7.382%
    • Forks count: 11.988%
    • Downloads: 16.887%
    • Average: 22.447%
    • Dependent repos count: 68.91%
  • Maintainers (1)
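
The "Average" entry under Rankings is consistent with an unweighted mean of the five listed percentiles. The following is a minimal sketch under that assumption; the dictionary keys are illustrative names, not fields confirmed by the service.

```python
# Percentile rankings copied from the list above (lower is better).
rankings = {
    "stargazers_count": 7.067,
    "dependent_packages_count": 7.382,
    "forks_count": 11.988,
    "downloads": 16.887,
    "dependent_repos_count": 68.91,
}

# Assumption: the reported average is the plain arithmetic mean.
average = sum(rankings.values()) / len(rankings)
print(f"{average:.3f}%")  # 22.447%
```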