https://github.com/ml6team/fondant

Production-ready data processing made easy and shareable

Keywords

data-processing fine-tuning foundation-models machine-learning pipeline python

Keywords from Contributors

transformers measur archiving animals observation conversion optimize compose generic language-model

Last synced: 11 months ago

Acceptance Criteria

Repository metadata

Production-ready data processing made easy and shareable


Owner metadata


Committers metadata

Last synced: over 1 year ago

Total Commits: 529
Total Committers: 24
Avg Commits per committer: 22.042
Development Distribution Score (DDS): 0.726

Commits in past year: 529
Committers in past year: 24
Avg Commits per committer in past year: 22.042
Development Distribution Score (DDS) in past year: 0.726

Name Email Commits
Philippe Moussalli p****5@g****m 145
Robbe Sneyders r****s@g****m 143
Niels Rogge n****e@N****l 69
Georges Lorré 3****e 57
Matthias Richter m****2@g****m 46
NielsRogge 4****e 26
Georges Lorre g****e@m****u 7
ChristiaensBert 9****t 7
hakiamri 1****9 6
Sharon Grundmann s****n@m****u 5
Till Wenke 3****e 3
SATISH J s****y@g****m 2
Sharon a****n@g****m 2
Shubham Krishna s****m@g****m 1
Alexander Remmerie 4****e 1
CarolineAdam 1****m 1
andres-vv 1****v 1
dependabot[bot] 4****] 1
jamesbraniganml6 1****6 1
janvanlooy 3****y 1
khaerensml6 9****6 1
RobinVC n****f@h****e 1
Philippe Moussalli p****e@m****i@m****u 1
Anush a****0@g****m 1
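
The summary figures above can be cross-checked against the committer table. The sketch below is a minimal illustration and not the indexer's own code: it assumes that DDS is defined as 1 minus the top committer's share of all commits, and it treats each table row as a separate committer, even where the same person appears under several identities. The variable names are illustrative.

```python
# Commit counts per table row, copied in order from the committer table.
# The last eleven rows each have a single commit.
commits = [145, 143, 69, 57, 46, 26, 7, 7, 6, 5, 3, 2, 2] + [1] * 11

total_commits = sum(commits)
total_committers = len(commits)
avg_commits = round(total_commits / total_committers, 3)

# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds = round(1 - max(commits) / total_commits, 3)

print(total_commits, total_committers, avg_commits, dds)
# prints: 529 24 22.042 0.726
```

Under these assumptions the recomputed values match the reported totals, average, and DDS.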


Issue and Pull Request metadata

Last synced: 12 months ago

Total issues: 151
Total pull requests: 200
Average time to close issues: about 2 months
Average time to close pull requests: 4 days
Total issue authors: 13
Total pull request authors: 11
Average comments per issue: 1.38
Average comments per pull request: 0.86
Merged pull requests: 160
Bot issues: 0
Bot pull requests: 0

Past year issues: 145
Past year pull requests: 200
Past year average time to close issues: about 1 month
Past year average time to close pull requests: 4 days
Past year issue authors: 12
Past year pull request authors: 11
Past year average comments per issue: 1.43
Past year average comments per pull request: 0.86
Past year merged pull requests: 160
Past year bot issues: 0
Past year bot pull requests: 0

More stats: https://issues.ecosyste.ms/repositories/lookup?url=https://github.com/ml6team/fondant

Top Issue Authors

  • RobbeSneyders (58)
  • PhilippeMoussalli (29)
  • GeorgesLorre (23)
  • mrchtr (23)
  • NielsRogge (4)
  • picousse (3)
  • satishjasthi (3)
  • janvanlooyml6 (2)
  • philippe-ml6 (2)
  • andres-vv (1)
  • jBontinck (1)
  • jufeif (1)
  • KasperZutterman (1)

Top Pull Request Authors

  • PhilippeMoussalli (59)
  • mrchtr (53)
  • RobbeSneyders (47)
  • GeorgesLorre (21)
  • Hakimovich99 (8)
  • NielsRogge (6)
  • CarolineAdam (2)
  • shub-kris (1)
  • ChristiaensBert (1)
  • andres-vv (1)
  • Philmod (1)

Top Issue Labels

  • Core (26)
  • Components (21)
  • documentation (11)
  • CI/CD (7)
  • bug (7)
  • Testing (6)
  • Infrastructure (6)
  • Ease of use (6)
  • Data explorer (4)
  • enhancement (4)
  • Hub (1)
  • question (1)

Top Pull Request Labels

  • Core (1)
  • documentation (1)

Package metadata

pypi.org: fondant

Fondant - Large-scale data processing made easy and reusable

  • Homepage: https://github.com/ml6team/fondant
  • Documentation: https://fondant.readthedocs.io/
  • Licenses: Apache-2.0
  • Latest release: 1.0.0 (published over 1 year ago)
  • Last Synced: 2024-05-22T12:01:01.815Z (12 months ago)
  • Versions: 45
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 704 last month
  • Rankings:
    • Stargazers count: 4.828%
    • Downloads: 7.097%
    • Forks count: 9.324%
    • Dependent packages count: 10.126%
    • Average: 10.582%
    • Dependent repos count: 21.535%
  • Maintainers (2)

Dependencies

.github/workflows/pipeline.yaml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
pyproject.toml pypi
  • dataclasses-json ^0.5.7
  • datasets ^2.10.1
  • kfp ^1.8.19
  • kubernetes ^18.20.0
  • pandas ^1.3.5
  • python ^3.8
.github/workflows/build.yaml actions
  • actions/checkout v3 composite
  • aws-actions/amazon-ecr-login v2 composite
  • aws-actions/configure-aws-credentials v1 composite
  • docker/login-action v2 composite
  • docker/setup-buildx-action v2 composite
.github/workflows/prep-release.yaml actions
  • actions/checkout master composite
  • actions/setup-python v1 composite
  • aws-actions/amazon-ecr-login v2 composite
  • aws-actions/configure-aws-credentials v1 composite
  • docker/login-action v2 composite
  • docker/setup-buildx-action v2 composite
  • pypa/gh-action-pypi-publish v1.8.6 composite
.github/workflows/release.yaml actions
  • actions/checkout master composite
  • actions/setup-python v1 composite
  • docker/login-action v2 composite
  • docker/setup-buildx-action v2 composite
  • pypa/gh-action-pypi-publish v1.8.6 composite
components/caption_images/Dockerfile docker
  • base latest build
  • pytorch/pytorch 2.0.1-cuda11.7-cudnn8-runtime build
components/chunk_text/Dockerfile docker
  • base latest build
  • python 3.8-slim build
components/crop_images/Dockerfile docker
  • python 3.8-slim build
components/download_images/Dockerfile docker
  • base latest build
  • python 3.8-slim build
components/embed_images/Dockerfile docker
  • pytorch/pytorch 2.0.1-cuda11.7-cudnn8-runtime build
components/embed_text/Dockerfile docker
  • base latest build
  • pytorch/pytorch 2.0.1-cuda11.7-cudnn8-runtime build
components/evaluate_ragas/Dockerfile docker
  • base latest build
  • python 3.8-slim build
components/extract_image_resolution/Dockerfile docker
  • python 3.8-slim build
components/filter_image_resolution/Dockerfile docker
  • python 3.8-slim build
components/filter_language/Dockerfile docker
  • base latest build
  • python 3.8-slim build
components/filter_text_length/Dockerfile docker
  • base latest build
  • python 3.8-slim build
components/generate_minhash/Dockerfile docker
  • base latest build
  • python 3.8-slim build
components/index_qdrant/Dockerfile docker
  • base latest build
  • python 3.8-slim build
components/index_weaviate/Dockerfile docker
  • python 3.8-slim build
components/load_from_csv/Dockerfile docker
  • base latest build
  • python 3.8-slim build
components/load_from_files/Dockerfile docker
  • base latest build
  • python 3.8-slim build
components/load_from_hf_hub/Dockerfile docker
  • python 3.8-slim build
components/load_from_parquet/Dockerfile docker
  • python 3.8-slim build
components/load_with_llamahub/Dockerfile docker
  • base latest build
  • python 3.8-slim build
components/normalize_text/Dockerfile docker
  • base latest build
  • python 3.8-slim build
components/resize_images/Dockerfile docker
  • python 3.8-slim build
components/retrieve_from_weaviate/Dockerfile docker
  • base latest build
  • python 3.8-slim build
components/retrieve_laion_by_embedding/Dockerfile docker
  • base latest build
  • python 3.8-slim build
components/retrieve_laion_by_prompt/Dockerfile docker
  • base latest build
  • python 3.8-slim build
components/segment_images/Dockerfile docker
  • pytorch/pytorch 2.0.1-cuda11.7-cudnn8-runtime build
components/write_to_hf_hub/Dockerfile docker
  • python 3.8-slim build
data_explorer/Dockerfile docker
  • python 3.8-slim build
tests/examples/example_component/Dockerfile docker
  • python 3.8-slim build
tests/integration_tests/sample_pipeline_test/components/dummy_component/Dockerfile docker
  • base latest build
  • python 3.8-slim build
tests/pipeline/examples/pipelines/valid_pipeline/example_1/first_component/Dockerfile docker
tests/pipeline/examples/pipelines/valid_pipeline/example_1/fourth_component/Dockerfile docker
tests/pipeline/examples/pipelines/valid_pipeline/example_1/second_component/Dockerfile docker
tests/pipeline/examples/pipelines/valid_pipeline/example_1/third_component/Dockerfile docker
components/caption_images/requirements.txt pypi
  • Pillow ==10.0.1
  • torch ==2.0.1
  • transformers ==4.29.2
components/caption_images/tests/requirements.txt pypi
  • pytest ==7.4.2 test
components/chunk_text/requirements.txt pypi
  • langchain ==0.0.329
components/chunk_text/tests/requirements.txt pypi
  • pytest ==7.4.2 test
components/crop_images/requirements.txt pypi
  • Pillow ==10.0.1
components/download_images/requirements.txt pypi
  • albumentations ==1.3.0
  • httpx ==0.24.1
  • opencv-python-headless >=4.5.5.62,<5
components/download_images/test_requirements.txt pypi
  • pytest ==7.4.0 test
  • respx ==0.20.2 test
components/download_images/tests/requirements.txt pypi
  • pytest ==7.4.0 test
  • respx ==0.20.2 test
components/embed_images/requirements.txt pypi
  • Pillow ==10.0.1
  • transformers ==4.28.0
components/embed_text/requirements.txt pypi
  • aleph_alpha_client ==3.5.1
  • cohere ==4.27
  • google-cloud-aiplatform ==1.34.0
  • langchain ==0.0.329
  • openai ==0.28.1
  • pandas ==1.5.0
  • retry ==0.9.2
  • sentence-transformers ==2.2.2
  • tiktoken ==0.5.1
components/embed_text/tests/requirements.txt pypi
  • pytest ==7.4.2 test
components/evaluate_ragas/requirements.txt pypi
  • ragas ==0.0.21
components/evaluate_ragas/tests/requirements.txt pypi
  • pytest ==7.4.2 test
components/extract_image_resolution/requirements.txt pypi
  • imagesize ==1.4.1
components/filter_image_resolution/requirements.txt pypi
components/filter_language/requirements.txt pypi
  • fasttext-wheel ==0.9.2
components/filter_language/tests/requirements.txt pypi
  • pytest ==7.4.2 test
components/filter_text_length/requirements.txt pypi
  • fasttext-wheel ==0.9.2
  • pyarrow >=7.0
components/filter_text_length/tests/requirements.txt pypi
  • pytest ==7.4.2 test
components/generate_minhash/requirements.txt pypi
  • datasketch ==1.5.9
  • nltk ==3.8.1
components/generate_minhash/tests/requirements.txt pypi
  • pytest ==7.4.2 test
components/index_qdrant/requirements.txt pypi
  • qdrant_client ==1.6.9
components/index_qdrant/tests/requirements.txt pypi
  • pytest ==7.4.2 test
components/index_weaviate/requirements.txt pypi
  • weaviate-client ==3.24.2
components/load_from_csv/tests/requirements.txt pypi
  • pytest ==7.4.2 test
components/load_from_files/requirements.txt pypi
  • dask ==2023.5.0
  • fsspec ==2023.6.0
  • pandas ==2.0.3
components/load_from_files/tests/requirements.txt pypi
  • pytest ==7.4.2 test
  • pytest-mock ==3.12.0 test
components/load_from_hf_hub/requirements.txt pypi
  • Pillow ==10.0.1
  • huggingface_hub ==0.14.1
components/load_from_parquet/requirements.txt pypi
components/load_with_llamahub/requirements.txt pypi
  • llama-index ==0.9.9
components/load_with_llamahub/tests/requirements.txt pypi
  • pytest ==7.4.2 test
components/normalize_text/requirements.txt pypi
  • ftfy ==6.1.1
components/normalize_text/tests/requirements.txt pypi
  • pytest ==7.4.0 test
components/resize_images/requirements.txt pypi
  • Pillow ==10.0.1
components/retrieve_from_weaviate/requirements.txt pypi
  • weaviate-client ==3.24.1
components/retrieve_from_weaviate/tests/requirements.txt pypi
  • pytest ==7.4.2 test
components/retrieve_laion_by_embedding/requirements.txt pypi
components/retrieve_laion_by_embedding/tests/requirements.txt pypi
  • pytest ==7.4.2 test
components/retrieve_laion_by_prompt/requirements.txt pypi
components/retrieve_laion_by_prompt/tests/requirements.txt pypi
  • pytest ==7.4.2 test
components/segment_images/requirements.txt pypi
  • Pillow ==10.0.1
  • torch ==2.0.1
  • transformers ==4.29.2
components/write_to_hf_hub/requirements.txt pypi
  • Pillow ==10.0.1
  • datasets ==2.10.1
  • huggingface_hub ==0.14.1
data_explorer/requirements.txt pypi
  • beautifulsoup4 ==4.12.2
  • fpdf ==1.7.2
  • graphviz ==0.20.1
  • matplotlib ==3.7.1
  • plotly ==5.15.0
  • st-pages ==0.4.5
  • streamlit ==1.28.2
  • streamlit-aggrid ==0.3.4
  • streamlit-extras ==0.3.5
tests/integration_tests/sample_pipeline_test/components/dummy_component/requirements.txt pypi
  • langchain ==0.0.329 test
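
Several packages in the listing above are pinned to different versions in different components (for example `pandas`, `transformers`, and `weaviate-client`). The sketch below shows one way such divergent pins could be surfaced. It uses only a hand-copied subset of the rows, and `find_divergent_pins` is a hypothetical helper written for this illustration, not part of Fondant.

```python
from collections import defaultdict

# (manifest, package, version spec) rows copied from the listing above.
# Only a subset is included to keep the sketch short.
PINS = [
    ("components/caption_images/requirements.txt", "transformers", "==4.29.2"),
    ("components/embed_images/requirements.txt", "transformers", "==4.28.0"),
    ("components/segment_images/requirements.txt", "transformers", "==4.29.2"),
    ("components/embed_text/requirements.txt", "pandas", "==1.5.0"),
    ("components/load_from_files/requirements.txt", "pandas", "==2.0.3"),
    ("components/index_weaviate/requirements.txt", "weaviate-client", "==3.24.2"),
    ("components/retrieve_from_weaviate/requirements.txt", "weaviate-client", "==3.24.1"),
    ("components/chunk_text/requirements.txt", "langchain", "==0.0.329"),
    ("components/embed_text/requirements.txt", "langchain", "==0.0.329"),
]


def find_divergent_pins(rows):
    """Return {package: sorted specs} for packages pinned to more than one spec."""
    specs_by_package = defaultdict(set)
    for _manifest, package, spec in rows:
        specs_by_package[package].add(spec)
    return {
        package: sorted(specs)
        for package, specs in sorted(specs_by_package.items())
        if len(specs) > 1
    }


divergent = find_divergent_pins(PINS)
for package, specs in divergent.items():
    print(package, specs)
```

Each component has its own Dockerfile and requirements file, so differing pins are not necessarily a conflict. A report like this is mainly useful for deciding which pins to align.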

Score: 15.668673639634413