https://github.com/hmunachi/nanodl

A JAX-based library for designing and training transformer models from scratch.
attention attention-mechanism deep-learning distributed-training flax gpt jax llama machine-learning mistral nlp transformer
Added: about 1 year ago - Last Synced: 11 months ago - Created: August 22, 2023

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits:
  • Committers:
  • Issues: 9
  • Pull Requests: 15
  • Owner: HMUNACHI
  • Stars: 256
  • Forks: 10
  • Packages: 1
  • Downloads: 103
https://github.com/alirezatheh/perke

A keyphrase extractor for Persian
data-mining data-processing information-retrieval keyphrase keyphrase-extraction keyphrase-extractor keyword keyword-extraction keyword-extractor machine-learning ml natural-language-processing nlp persian persian-language python text-mining text-processing unsupervised-learning
Added: over 1 year ago - Last Synced: 11 months ago - Created: February 03, 2020

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 87
  • Committers: 4
  • Issues: 0
  • Pull Requests: 4
  • Owner: AlirezaTheH
  • Stars: 68
  • Forks: 7
  • Packages: 1
  • Downloads: 77
https://github.com/openvinotoolkit/nncf

Neural Network Compression Framework for enhanced OpenVINO™ inference
bert classification compression deep-learning hawq mixed-precision-training mmdetection nlp object-detection onnx openvino pruning pytorch quantization quantization-aware-training semantic-segmentation sparsity tensorflow transformers
Added: over 1 year ago - Last Synced: 11 months ago - Created: May 13, 2020

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 1778
  • Committers: 67
  • Issues: 61
  • Pull Requests: 797
  • Owner: openvinotoolkit
  • Stars: 772
  • Forks: 197
  • Packages: 2
  • Downloads: 71,044
https://github.com/kensho-technologies/sequence_align

Efficient implementations of Needleman-Wunsch and other sequence alignment algorithms written in Rust with Python bindings via PyO3.
bioinformatics hirschberg natural-language-processing needleman-wunsch nlp pyo3 python rust sequence-alignment
Added: over 1 year ago - Last Synced: 11 months ago - Created: April 05, 2023

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 13
  • Committers: 2
  • Issues: 5
  • Pull Requests: 10
https://github.com/konbraphat51/animatedwordcloud

Animate a timelapse of a word cloud
animation datascience natural-language-processing nlp video visualization wordcloud
Added: over 1 year ago - Last Synced: 11 months ago - Created: November 15, 2023

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 675
  • Committers: 4
  • Issues: 45
  • Pull Requests: 87
  • Owner: konbraphat51
  • Stars: 9
  • Forks: 0
  • Packages: 1
  • Downloads: 271
https://github.com/geobrain-ai/geogalactica

Code and datasets for paper "GeoGalactica: A Scientific Large Language Model in Geoscience"
geoscience llm nlp
Added: over 1 year ago - Last Synced: 11 months ago - Created: July 27, 2023

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 10
  • Committers: 1
  • Issues: 2
  • Pull Requests: 0
https://github.com/allenai/dolma

Data and tools for generating and inspecting OLMo pre-training data.
data-processing large-language-models llm machile-learning nlp
Added: over 1 year ago - Last Synced: 11 months ago - Created: June 20, 2023

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 233
  • Committers: 13
  • Issues: 87
  • Pull Requests: 135
  • Owner: allenai
  • Stars: 800
  • Forks: 78
  • Packages: 1
  • Downloads: 17,721
https://github.com/dpriskorn/odsc

Project that aims to sentenize all the open data of Riksdagen and other sources to create an easily linkable dataset of sentences that can be referred to from Wikidata lexemes and other resources
civic-tech entity-linking folketinget named-entity-recognition nlp part-of-speech-tagging riksdagen riksdagensoppnadata wikidata wikidata-lexemes
Added: over 1 year ago - Last Synced: 11 months ago - Created: November 20, 2023

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 136
  • Committers: 2
  • Issues: 30
  • Pull Requests: 11
  • Owner: dpriskorn
  • Stars: 0
  • Forks: 0
  • Packages: 0
https://github.com/knowledge-graph-hub/kg-microbe


anatomical-knowledge cell-shapes chebi chemicals data-modeling environments envo go knowledge-graph media metabolism microbiology named-entity-recognition ncbitaxonomy nlp oaklib phenotypes robot traits
Added: over 1 year ago - Last Synced: 11 months ago - Created: November 13, 2020

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Jupyter Notebook
  • Commits: 688
  • Committers: 8
  • Issues: 89
  • Pull Requests: 102
https://github.com/dongrixinyu/jionlp

A Chinese NLP Preprocessing & Parsing Package: accurate, efficient, and easy to use. www.jionlp.com
apache2 chinese natural-language-processing ner nlp nlp-parse preprocessing python time-parse time-parsing
Added: over 1 year ago - Last Synced: 11 months ago - Created: March 13, 2020

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 489
  • Committers: 14
  • Issues: 261
  • Pull Requests: 32
  • Owner: dongrixinyu
  • Stars: 3054
  • Forks: 370
  • Packages: 4
  • Downloads: 2,815
https://github.com/liaad/pt-pump-up

Hub for the Portuguese language NLP Resources
natural-language-processing nlp nlp-datasets nlp-resources portuguese-language resources
Added: over 1 year ago - Last Synced: 11 months ago - Created: October 25, 2023

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: PHP
  • Commits: 97
  • Committers: 2
  • Issues: 13
  • Pull Requests: 13
  • Owner: LIAAD
  • Stars: 4
  • Forks: 0
  • Packages: 0
https://github.com/deepraj1729/tchatbot

A chatbot framework for creating customizable, all-purpose chatbots using NLP, TensorFlow, and speech recognition
artificial-intelligence chatbot-framework conda deep-learning framework git github machine-learning neural-networks nlp nltk numpy pip pypi python3 sklearn speech-recognition tensorflow virtual-environment
Added: over 1 year ago - Last Synced: 11 months ago - Created: June 22, 2020

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 32
  • Committers: 5
  • Issues: 1
  • Pull Requests: 11
  • Owner: deepraj1729
  • Stars: 20
  • Forks: 1
  • Packages: 1
  • Downloads: 20
https://github.com/mv1388/aitoolbox

PyTorch Model Training and Experiment Tracking Framework
deep-learning machine-learning nlp python pytorch research
Added: over 1 year ago - Last Synced: 11 months ago - Created: July 18, 2017

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 1952
  • Committers: 2
  • Issues: 0
  • Pull Requests: 100
  • Owner: mv1388
  • Stars: 2
  • Forks: 2
  • Packages: 1
  • Downloads: 95
https://github.com/allenai/smashed

SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batching, and more. Supports datasets from Huggingface, torchdata iterables, or simple lists of dictionaries.
dataset datasets dict huggingface in-context-learning mappers natural-language-processing nlp pipeline prefix prefix-tuning preprocessing prompting pytorch text torchdata transformer transformers
Added: over 1 year ago - Last Synced: 11 months ago - Created: July 21, 2022

  • Relevant topics? true
  • External users? true
  • Open source license? true
  • Active? true
  • Fork? false
  • Main Language: Python
  • Commits: 145
  • Committers: 6
  • Issues: 1
  • Pull Requests: 65
  • Owner: allenai
  • Stars: 30
  • Forks: 3
  • Packages: 1
  • Downloads: 11,380