An open API service for producing an overview of a list of open source projects.

awesome-llama: https://github.com/datajuicer/data-juicer

Topics: data, data-analysis, data-pipeline, data-processing, data-science, data-visualization, foundation-models, instruction-tuning, large-language-models, llm, llms, multi-modal, pre-training, synthetic-data

Score: 21.114911054763752

Last synced: about 19 hours ago

Repository metadata:

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷


Committers metadata

Last synced: 10 days ago

Total Commits: 558
Total Committers: 47
Avg Commits per committer: 11.872
Development Distribution Score (DDS): 0.72

Commits in past year: 189
Committers in past year: 29
Avg Commits per committer in past year: 6.517
Development Distribution Score (DDS) in past year: 0.73

Name Email Commits
Yilun Huang l****l@a****m 156
Daoyuan Chen 6****c 51
BeachWang 1****7@p****n 48
Cathy0908 3****8 41
MeiXin Chen 8****n 35
Ce Ge (戈策) g****e@f****m 35
zhijianma z****j@a****m 30
garyzhang99 4****9 17
Cyrus Zhang c****g@g****m 14
chenhesen h****s@a****m 12
Xuchen Pan 3****c 12
Qirui-jiao 1****o 12
Du Bin d****5@g****m 11
Yuhan Liu 3****x 10
co63oc c****c 10
Zhen Qin z****n@g****m 7
John Giorgi j****i@g****m 6
Xinyu Zhang 6****h 6
chenyushuo 2****6@q****m 5
kyotom 3****m 4
HunterLine 1****0@q****m 3
lingzhq 1****q 3
2108038773 1****3 2
JamieYu y****a@f****m 2
ShenQianli s****i@u****u 2
Yuexiang XIE y****x@a****m 2
Shurui Kou 1****2@q****m 2
weijie 3****o 1
simplaj 3****j 1
seanzhang-zhichen 7****n 1
and 17 more...
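
The all-time committer statistics above can be reproduced from the listed counts. The sketch below is illustrative only: the function names are hypothetical, and it assumes the Development Distribution Score is defined as one minus the top committer's share of all commits. That assumption is consistent with the all-time figures shown here (156 of 558 commits for the top committer), but the listing itself does not state the formula.

```python
def average_commits(total_commits: int, total_committers: int) -> float:
    """Mean commits per committer, rounded to three decimals as in the listing."""
    if total_committers <= 0:
        raise ValueError("total_committers must be positive")
    return round(total_commits / total_committers, 3)


def development_distribution_score(top_committer_commits: int, total_commits: int) -> float:
    """Assumed definition: 1 - (top committer's commits / total commits)."""
    if total_commits <= 0:
        raise ValueError("total_commits must be positive")
    return round(1 - top_committer_commits / total_commits, 2)


# All-time figures taken from the listing above.
all_time_avg = average_commits(558, 47)
all_time_dds = development_distribution_score(156, 558)

# Past-year average; the past-year DDS is not recomputed here because
# the per-committer table only gives all-time commit counts.
past_year_avg = average_commits(189, 29)

print(all_time_avg, all_time_dds, past_year_avg)
```

A score near 0 would mean a single committer authored almost everything; a score closer to 1 means the work is spread across many committers.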

Issue and Pull Request metadata

Last synced: 5 months ago

Total issues: 0
Total pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 0
Total pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past year issues: 0
Past year pull requests: 0
Past year average time to close issues: N/A
Past year average time to close pull requests: N/A
Past year issue authors: 0
Past year pull request authors: 0
Past year average comments per issue: 0
Past year average comments per pull request: 0
Past year merged pull requests: 0
Past year bot issues: 0
Past year bot pull requests: 0

More stats: https://issues.ecosyste.ms/repositories/lookup?url=https://github.com/datajuicer/data-juicer


Package metadata

pypi.org: py-data-juicer

Data Processing for and with Foundation Models.

  • Documentation: https://py-data-juicer.readthedocs.io/
  • Licenses: Apache-2.0
  • Latest release: 1.5.1 (published 3 months ago)
  • Last Synced: 2026-05-20T14:21:33.633Z (13 days ago)
  • Versions: 27
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 4,826 last month
  • Rankings:
    • Stargazers count: 7.067%
    • Dependent packages count: 7.382%
    • Forks count: 11.988%
    • Downloads: 16.887%
    • Average: 22.447%
    • Dependent repos count: 68.91%
  • Maintainers (1)
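
The "Average" ranking listed above appears to be the unweighted mean of the five other ranking percentiles. This is an assumption inferred from the numbers, not something the listing states; the short sketch below simply checks the arithmetic, and the dictionary keys are illustrative labels rather than field names from any API.

```python
# Ranking percentiles copied from the listing above.
rankings = {
    "stargazers_count": 7.067,
    "dependent_packages_count": 7.382,
    "forks_count": 11.988,
    "downloads": 16.887,
    "dependent_repos_count": 68.91,
}

# Assumed definition: plain arithmetic mean, rounded to three decimals.
average_ranking = round(sum(rankings.values()) / len(rankings), 3)

print(average_ranking)
```

Under this reading, the high dependent-repos percentile is what pulls the average well above the other four rankings.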