https://github.com/iusca/bioloop

Scientific data management portal and pipeline application template
data-delivery data-management data-management-platform pipeline research science workflow workflow-automation
Added: about 1 year ago - Last Synced: 11 months ago - Created: June 05, 2023

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Vue
  • Commits:
  • Committers:
  • Issues: 54
  • Pull Requests: 46
  • Owner: IUSCA
  • Stars: 3
  • Forks: 1
  • Packages: 0
https://github.com/asyml/forte

Forte is a flexible and powerful ML workflow builder. This is part of the CASL project: http://casl-project.ai/
data-processing deep-learning information-retrieval machine-learning natural-language natural-language-processing pipeline python text-data
Added: over 1 year ago - Last Synced: 11 months ago - Created: August 09, 2019

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 1028
  • Committers: 53
  • Issues: 55
  • Pull Requests: 49
  • Owner: asyml
  • Stars: 236
  • Forks: 60
  • Packages: 1
  • Downloads: 173
https://github.com/zeroto521/my-data-toolkit

Face the engineering of data preprocessing.
accessor geopandas geopandas-accessor pandas pandas-accessor pandas-like-object-accessor pipeline register sklearn transformer
Added: over 1 year ago - Last Synced: 11 months ago - Created: June 02, 2021

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 1534
  • Committers: 4
  • Issues: 9
  • Pull Requests: 135
  • Owner: Zeroto521
  • Stars: 2
  • Forks: 0
  • Packages: 1
  • Downloads: 94
https://github.com/gibsramen/qadabra

Snakemake workflow for comparison of differential abundance ranks
bioinformatics differential-abundance machine-learning metagenomics microbiome pipeline snakemake workflow
Added: over 1 year ago - Last Synced: 11 months ago - Created: May 04, 2022

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 60
  • Committers: 3
  • Issues: 16
  • Pull Requests: 23
  • Owner: biocore
  • Stars: 13
  • Forks: 4
  • Packages: 1
  • Downloads: 8
https://github.com/tenox7/ttyplot

A realtime plotting utility for terminal/console with data input from stdin
chart cli cli-app command-line-tool commandline console console-tool cpu graph ping pipe pipeline plot realtime sar snmp snmp-network-throughput snmpget stdin
Added: over 1 year ago - Last Synced: 11 months ago - Created: October 11, 2018

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: C
  • Commits: 325
  • Committers: 18
  • Issues: 66
  • Pull Requests: 118
  • Owner: tenox7
  • Stars: 956
  • Forks: 42
  • Packages: 7
  • Downloads: 133
https://github.com/ml6team/fondant

Production-ready data processing made easy and shareable
data-processing fine-tuning foundation-models machine-learning pipeline python
Added: over 1 year ago - Last Synced: 11 months ago - Created: March 02, 2023

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 529
  • Committers: 24
  • Issues: 151
  • Pull Requests: 200
  • Owner: ml6team
  • Stars: 316
  • Forks: 24
  • Packages: 1
  • Downloads: 704
https://github.com/giacbrd/smartpipeline

A framework for rapid development of robust data pipelines following a simple design pattern
data-analysis data-analytics data-mining data-pipelines data-processing data-science dataops design-patterns etl machine-learning mlops pipeline pipeline-framework pipelines reproducibility task-queue workflow
Added: over 1 year ago - Last Synced: 11 months ago - Created: September 03, 2018

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 275
  • Committers: 3
  • Issues: 0
  • Pull Requests: 3
  • Owner: giacbrd
  • Stars: 22
  • Forks: 2
  • Packages: 1
  • Downloads: 56
https://github.com/zazuko/barnard59

An intuitive and flexible RDF pipeline solution designed to simplify and automate ETL processes for efficient data management.
data-integration data-pipeline data-processing etl json-ld linked-data pipeline rdf semantic-web
Added: over 1 year ago - Last Synced: 11 months ago - Created: October 25, 2018

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: JavaScript
  • Commits: 1159
  • Committers: 19
  • Issues: 147
  • Pull Requests: 282
  • Owner: zazuko
  • Stars: 21
  • Forks: 2
  • Packages: 14
  • Downloads: 100,567
https://github.com/numaproj/numaflow

Kubernetes-native platform to run massively parallel data/streaming jobs
data-processing hacktoberfest k8s kubernetes map-reduce pipeline stream-processing
Added: over 1 year ago - Last Synced: 11 months ago - Created: May 20, 2022

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Go
  • Commits: 917
  • Committers: 65
  • Issues: 220
  • Pull Requests: 228
  • Owner: numaproj
  • Stars: 913
  • Forks: 88
  • Packages: 4
  • Downloads: 64
https://github.com/n0rdy/pippin

Go library to create and manage data pipelines on your machine
async asynchronous data data-engineering data-pipeline data-processing go golang golang-library golang-package goroutines pipeline
Added: over 1 year ago - Last Synced: 11 months ago - Created: November 18, 2023

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Go
  • Commits: 20
  • Committers: 3
  • Issues: 0
  • Pull Requests: 5
  • Owner: n0rdy
  • Stars: 14
  • Forks: 0
  • Packages: 1
https://github.com/cihga39871/pipelines.jl

A lightweight and powerful Julia package for computational pipelines and workflows.
bioinformatics bioinformatics-pipeline computational-pipelines julia pipeline pipelines ruffus snakemake workflow workflows
Added: over 1 year ago - Last Synced: 11 months ago - Created: March 23, 2021

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Julia
  • Commits: 155
  • Committers: 2
  • Issues: 13
  • Pull Requests: 1
  • Owner: cihga39871
  • Stars: 45
  • Forks: 2
  • Packages: 1
https://github.com/tektoncd/catalog

Catalog of shared Tasks and Pipelines.
catalog hacktoberfest k8s pipeline re-useable task tekton
Added: over 1 year ago - Last Synced: 11 months ago - Created: April 25, 2019

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Shell
  • Commits: 1100
  • Committers: 218
  • Issues: 72
  • Pull Requests: 136
  • Owner: tektoncd
  • Stars: 646
  • Forks: 564
  • Packages: 1
https://github.com/biocore/qadabra

Snakemake workflow for comparison of differential abundance ranks
bioinformatics differential-abundance machine-learning metagenomics microbiome pipeline snakemake workflow
Added: over 1 year ago - Last Synced: 11 months ago - Created: May 04, 2022

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 96
  • Committers: 4
  • Issues: 41
  • Pull Requests: 51
  • Owner: biocore
  • Stars: 13
  • Forks: 4
  • Packages: 0
https://github.com/openomics/ervx

Endogenous Retrovirus Expression Pipeline for Human and Mouse for use with bulk RNA
endogenous-retrovirus-expression herv human mouse pipeline quality-control singularity snakemake workflow
Added: over 1 year ago - Last Synced: 11 months ago - Created: March 07, 2023

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 47
  • Committers: 2
  • Issues: 1
  • Pull Requests: 0
  • Owner: OpenOmics
  • Stars: 1
  • Forks: 1
  • Packages: 0
https://github.com/epigen/spilterlize_integrate

A Snakemake workflow to split, filter, normalize, integrate, and select highly variable features of count matrices resulting from experiments with sequencing readout (e.g., RNA-seq, ATAC-seq, ChIP-seq, Methyl-seq, miRNA-seq), including diagnostic visualizations.
atac-seq batch-effect bioinformatics biomedical-data-science chip-seq count-matrix dimensionality-reduction integration ngs normalization pipeline rna-seq snakemake visualization workflow
Added: over 1 year ago - Last Synced: 11 months ago - Created: June 28, 2023

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 22
  • Committers: 2
  • Issues: 7
  • Pull Requests: 0
  • Owner: epigen
  • Stars: 8
  • Forks: 0
  • Packages: 0
https://github.com/nf-core/bactmap

A mapping-based pipeline for creating a phylogeny from bacterial whole genome sequences
bacteria bacterial bacterial-genome-analysis genomics mapping nextflow nf-core phylogeny pipeline tree workflow
Added: over 1 year ago - Last Synced: 11 months ago - Created: June 09, 2019

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Nextflow
  • Commits: 225
  • Committers: 8
  • Issues: 25
  • Pull Requests: 42
  • Owner: nf-core
  • Stars: 48
  • Forks: 27
  • Packages: 0
https://github.com/norwegianveterinaryinstitute/alppaca

A tooL for Prokaryotic Phylogeny And Clustering Analysis
clustering nextflow phylogeny pipeline
Added: over 1 year ago - Last Synced: 11 months ago - Created: August 14, 2020

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Nextflow
  • Commits: 525
  • Committers: 4
  • Issues: 84
  • Pull Requests: 84
https://github.com/Gabaldonlab/redundans

Redundans is a pipeline that assists in the assembly of heterozygous/polymorphic genomes.
assembled-contigs assembly bioinformatics closing contigs docker-image fasta gap genome-assembly genomics heterozygous mate-pairs paired-end pipeline polymorphic python scaffolding
Added: over 1 year ago - Last Synced: 11 months ago - Created: April 07, 2015

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: C++
  • Commits: 560
  • Committers: 12
  • Issues: 102
  • Pull Requests: 6
https://github.com/hackingmaterials/automatminer

An automatic engine for predicting materials properties.
machine-learning material-properties pipeline prediction
Added: over 1 year ago - Last Synced: 11 months ago - Created: May 10, 2018

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 1467
  • Committers: 12
  • Issues: 16
  • Pull Requests: 85
https://github.com/gibsramen/xebec

Snakemake pipeline for microbiome diversity effect size benchmarking
bioinformatics microbiome pipeline snakemake
Added: over 1 year ago - Last Synced: 11 months ago - Created: March 22, 2022

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 119
  • Committers: 2
  • Issues: 0
  • Pull Requests: 14
  • Owner: gibsramen
  • Stars: 5
  • Forks: 2
  • Packages: 1
  • Downloads: 51
https://github.com/bishop-laboratory/rlpipes

RLPipes: A standardized R-loop-mapping pipeline.
bioinformatics ngs pipeline r-loop snakemake
Added: over 1 year ago - Last Synced: 11 months ago - Created: June 03, 2020

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 246
  • Committers: 4
  • Issues: 70
  • Pull Requests: 21
https://github.com/allenai/smashed

SMASHED is a toolkit designed to apply transformations to samples in datasets, such as field extraction, tokenization, prompting, batching, and more. It supports datasets from Hugging Face, torchdata iterables, or simple lists of dictionaries.
dataset datasets dict huggingface in-context-learning mappers natural-language-processing nlp pipeline prefix prefix-tuning preprocessing prompting pytorch text torchdata transformer transformers
Added: over 1 year ago - Last Synced: 11 months ago - Created: July 21, 2022

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 145
  • Committers: 6
  • Issues: 1
  • Pull Requests: 65
  • Owner: allenai
  • Stars: 30
  • Forks: 3
  • Packages: 1
  • Downloads: 11,380