https://github.com/iusca/bioloop
Scientific data management portal and pipeline application template
data-delivery
data-management
data-management-platform
pipeline
research
science
workflow
workflow-automation
Added: about 1 year ago - Last Synced: 11 months ago
- Created: June 05, 2023

https://github.com/asyml/forte
Forte is a flexible and powerful ML workflow builder. This is part of the CASL project: http://casl-project.ai/
data-processing
deep-learning
information-retrieval
machine-learning
natural-language
natural-language-processing
pipeline
python
text-data
Added: over 1 year ago - Last Synced: 11 months ago
- Created: August 09, 2019

https://github.com/zeroto521/my-data-toolkit
Face the engineering of data preprocessing.
accessor
geopandas
geopandas-accessor
pandas
pandas-accessor
pandas-like-object-accessor
pipeline
register
sklearn
transformer
Added: over 1 year ago - Last Synced: 11 months ago
- Created: June 02, 2021

https://github.com/gibsramen/qadabra
Snakemake workflow for comparison of differential abundance ranks
bioinformatics
differential-abundance
machine-learning
metagenomics
microbiome
pipeline
snakemake
workflow
Added: over 1 year ago - Last Synced: 11 months ago
- Created: May 04, 2022

https://github.com/tenox7/ttyplot
A realtime plotting utility for the terminal/console with data input from stdin
chart
cli
cli-app
command-line-tool
commandline
console
console-tool
cpu
graph
ping
pipe
pipeline
plot
realtime
sar
snmp
snmp-network-throughput
snmpget
stdin
Added: over 1 year ago - Last Synced: 11 months ago
- Created: October 11, 2018

https://github.com/ml6team/fondant
Production-ready data processing made easy and shareable
data-processing
fine-tuning
foundation-models
machine-learning
pipeline
python
Added: over 1 year ago - Last Synced: 11 months ago
- Created: March 02, 2023

https://github.com/giacbrd/smartpipeline
A framework for rapid development of robust data pipelines following a simple design pattern
data-analysis
data-analytics
data-mining
data-pipelines
data-processing
data-science
dataops
design-patterns
etl
machine-learning
mlops
pipeline
pipeline-framework
pipelines
reproducibility
task-queue
workflow
Added: over 1 year ago - Last Synced: 11 months ago
- Created: September 03, 2018

https://github.com/zazuko/barnard59
An intuitive and flexible RDF pipeline solution designed to simplify and automate ETL processes for efficient data management.
data-integration
data-pipeline
data-processing
etl
json-ld
linked-data
pipeline
rdf
semantic-web
Added: over 1 year ago - Last Synced: 11 months ago
- Created: October 25, 2018
- Relevant topics? true
- External users? true
- Open source license? true
- Active? true
- Fork? false
- Main Language: JavaScript
- Commits: 1159
- Committers: 19
- Issues: 147
- Pull Requests: 282
- Owner: zazuko
- Stars: 21
- Forks: 2
- Packages: 14
- Downloads: 100,567

https://github.com/numaproj/numaflow
Kubernetes-native platform to run massively parallel data/streaming jobs
data-processing
hacktoberfest
k8s
kubernetes
map-reduce
pipeline
stream-processing
Added: over 1 year ago - Last Synced: 11 months ago
- Created: May 20, 2022

https://github.com/n0rdy/pippin
Go library to create and manage data pipelines on your machine
async
asynchronous
data
data-engineering
data-pipeline
data-processing
go
golang
golang-library
golang-package
goroutines
pipeline
Added: over 1 year ago - Last Synced: 11 months ago
- Created: November 18, 2023

https://github.com/cihga39871/pipelines.jl
A lightweight and powerful Julia package for computational pipelines and workflows.
bioinformatics
bioinformatics-pipeline
computational-pipelines
julia
pipeline
pipelines
ruffus
snakemake
workflow
workflows
Added: over 1 year ago - Last Synced: 11 months ago
- Created: March 23, 2021
- Relevant topics? true
- External users? true
- Open source license? true
- Active? true
- Fork? false
- Main Language: Julia
- Commits: 155
- Committers: 2
- Issues: 13
- Pull Requests: 1
- Owner: cihga39871
- Stars: 45
- Forks: 2
- Packages: 1

https://github.com/tektoncd/catalog
Catalog of shared Tasks and Pipelines.
catalog
hacktoberfest
k8s
pipeline
re-useable
task
tekton
Added: over 1 year ago - Last Synced: 11 months ago
- Created: April 25, 2019

https://github.com/biocore/qadabra
Snakemake workflow for comparison of differential abundance ranks
bioinformatics
differential-abundance
machine-learning
metagenomics
microbiome
pipeline
snakemake
workflow
Added: over 1 year ago - Last Synced: 11 months ago
- Created: May 04, 2022

https://github.com/openomics/ervx
Endogenous Retrovirus Expression Pipeline for Human and Mouse for use with bulk RNA
endogenous-retrovirus-expression
herv
human
mouse
pipeline
quality-control
singularity
snakemake
workflow
Added: over 1 year ago - Last Synced: 11 months ago
- Created: March 07, 2023

https://github.com/epigen/spilterlize_integrate
A Snakemake workflow to split, filter, normalize, integrate and select highly variable features of count matrices resulting from experiments with sequencing readout (e.g., RNA-seq, ATAC-seq, ChIP-seq, Methyl-seq, miRNA-seq, ...), including diagnostic visualizations.
atac-seq
batch-effect
bioinformatics
biomedical-data-science
chip-seq
count-matrix
dimensionality-reduction
integration
ngs
normalization
pipeline
rna-seq
snakemake
visualization
workflow
Added: over 1 year ago - Last Synced: 11 months ago
- Created: June 28, 2023

https://github.com/nf-core/bactmap
A mapping-based pipeline for creating a phylogeny from bacterial whole genome sequences
bacteria
bacterial
bacterial-genome-analysis
genomics
mapping
nextflow
nf-core
phylogeny
pipeline
tree
workflow
Added: over 1 year ago - Last Synced: 11 months ago
- Created: June 09, 2019

https://github.com/norwegianveterinaryinstitute/alppaca
A tooL for Prokaryotic Phylogeny And Clustering Analysis
clustering
nextflow
phylogeny
pipeline
Added: over 1 year ago - Last Synced: 11 months ago
- Created: August 14, 2020
- Relevant topics? true
- External users? true
- Open source license? true
- Active? true
- Fork? false
- Main Language: Nextflow
- Commits: 525
- Committers: 4
- Issues: 84
- Pull Requests: 84
- Owner: NorwegianVeterinaryInstitute
- Stars: 10
- Forks: 3
- Packages: 0

https://github.com/Gabaldonlab/redundans
Redundans is a pipeline that assists the assembly of heterozygous/polymorphic genomes.
assembled-contigs
assembly
bioinformatics
closing
contigs
docker-image
fasta
gap
genome-assembly
genomics
heterozygous
mate-pairs
paired-end
pipeline
polymorphic
python
scaffolding
Added: over 1 year ago - Last Synced: 11 months ago
- Created: April 07, 2015
- Relevant topics? true
- External users? true
- Open source license? true
- Active? true
- Fork? false
- Main Language: C++
- Commits: 560
- Committers: 12
- Issues: 102
- Pull Requests: 6
- Owner: Gabaldonlab
- Stars: 123
- Forks: 19
- Packages: 1

https://github.com/hackingmaterials/automatminer
An automatic engine for predicting materials properties.
machine-learning
material-properties
pipeline
prediction
Added: over 1 year ago - Last Synced: 11 months ago
- Created: May 10, 2018
- Relevant topics? true
- External users? true
- Open source license? true
- Active? true
- Fork? false
- Main Language: Python
- Commits: 1467
- Committers: 12
- Issues: 16
- Pull Requests: 85
- Owner: hackingmaterials
- Stars: 134
- Forks: 48
- Packages: 1
- Downloads: 153

https://github.com/gibsramen/xebec
Snakemake pipeline for microbiome diversity effect size benchmarking
bioinformatics
microbiome
pipeline
snakemake
Added: over 1 year ago - Last Synced: 11 months ago
- Created: March 22, 2022

https://github.com/bishop-laboratory/rlpipes
RLPipes: A standardized R-loop-mapping pipeline.
bioinformatics
ngs
pipeline
r-loop
snakemake
Added: over 1 year ago - Last Synced: 11 months ago
- Created: June 03, 2020
- Relevant topics? true
- External users? true
- Open source license? true
- Active? true
- Fork? false
- Main Language: Python
- Commits: 246
- Committers: 4
- Issues: 70
- Pull Requests: 21
- Owner: Bishop-Laboratory
- Stars: 3
- Forks: 1
- Packages: 1
- Downloads: 43

https://github.com/allenai/smashed
SMASHED is a toolkit designed to apply transformations to samples in datasets, such as field extraction, tokenization, prompting, batching, and more. It supports datasets from Huggingface, torchdata iterables, or simple lists of dictionaries.
dataset
datasets
dict
huggingface
in-context-learning
mappers
natural-language-processing
nlp
pipeline
prefix
prefix-tuning
preprocessing
prompting
pytorch
text
torchdata
transformer
transformers
Added: over 1 year ago - Last Synced: 11 months ago
- Created: July 21, 2022
