[![GitHub issues](https://img.shields.io/github/issues/EleutherAI/gpt-neox)](https://github.com/EleutherAI/gpt-neox/issues)
[Weights & Biases monitoring](https://wandb.ai/eleutherai/neox)
# GPT-NeoX
This repository hosts [EleutherAI](https://www.eleuther.ai)'s library for training large-scale language models on GPUs. Our current framework is based on NVIDIA's [Megatron Language Model](https://github.com/NVIDIA/Megatron-LM) and has been augmented with techniques from [DeepSpeed](https://www.deepspeed.ai) as well as some novel optimizations. We aim to make this repo a centralized and accessible place to gather techniques for training large-scale autoregressive language models, and to accelerate research into large-scale training. This library is in widespread use in [academic, industry, and government labs](https://github.com/EleutherAI/gpt-neox#adoption-and-publications), including by researchers at Oak Ridge National Lab, CarperAI, Stability AI, Together.ai, Korea University, Carnegie Mellon University, and the University of Tokyo, among others. Uniquely among similar libraries, GPT-NeoX supports a wide variety of systems and hardware, including launching via Slurm, MPI, and the IBM Job Step Manager, and it has been run at scale on [AWS](https://aws.amazon.com/), [CoreWeave](https://www.coreweave.com/), [ORNL Summit](https://www.olcf.ornl.gov/summit/), [ORNL Frontier](https://www.olcf.ornl.gov/frontier/), [LUMI](https://www.lumi-supercomputer.eu/), and others.
**If you are not looking to train models with billions of parameters from scratch, this is likely the wrong library to use. For generic inference needs, we recommend you use the Hugging Face `transformers` library instead, which supports GPT-NeoX models.**
## Why GPT-NeoX?
GPT-NeoX leverages many of the same features and technologies as the popular Megatron-DeepSpeed library but with substantially increased usability and novel optimizations. Major features include:
* Distributed training with ZeRO and 3D parallelism
* Support for a wide variety of systems and hardware, including launching via Slurm, MPI, and the IBM Job Step Manager; it has been run at scale on [AWS](https://aws.amazon.com/), [CoreWeave](https://www.coreweave.com/), Oak Ridge's [Summit](https://www.olcf.ornl.gov/summit/) and [Frontier](https://www.olcf.ornl.gov/frontier/), [Pacific Northwest National Laboratory](https://hpc.pnl.gov/index.shtml), Argonne's [Polaris](https://docs.alcf.anl.gov/polaris/data-science-workflows/applications/gpt-neox/), [LUMI](https://www.lumi-supercomputer.eu/), and more.
* Cutting-edge architectural innovations, including rotary and ALiBi positional embeddings, parallel feedforward attention layers, and Flash Attention.
* Predefined configurations for popular architectures including Pythia, PaLM, Falcon, and LLaMA 1 & 2 (a sample launch command follows this list)
* Curriculum Learning
* Easy connections with the open source ecosystem, including Hugging Face's [tokenizers](https://github.com/huggingface/tokenizers) and [transformers](https://github.com/huggingface/transformers/) libraries, logging via [WandB](https://wandb.ai/site), and evaluation via our [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness).
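As a taste of how those predefined configurations are used, training is launched by pointing the repository's `deepy.py` launcher at one or more YAML config files. The sketch below is illustrative only: the file names are the small example configs shipped in `configs/`, and the full workflow is covered in the Usage and Configuration sections below.

```bash
# Illustrative launch only -- see the Usage and Configuration sections below.
# deepy.py wraps the DeepSpeed launcher; the listed YAML files are merged into one config.
python ./deepy.py train.py -d configs 125M.yml local_setup.yml
```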
## News
* **[9/20/2023]** As of https://github.com/EleutherAI/gpt-neox/pull/1035, we have deprecated Flash Attention 0.x and 1.x and migrated to Flash Attention 2.x. We don't believe this will cause problems, but if you have a specific use case that requires the older Flash Attention support with the latest GPT-NeoX, please raise an issue.
* **[8/10/2023]** We now support checkpointing with AWS S3! Activate it with the `s3_path` config option (for more detail, see [the PR](https://github.com/EleutherAI/gpt-neox/pull/1010)).
* **[8/10/2023]** We have experimental support for LLaMA 2 and Flash Attention v2 in our [math-lm](https://github.com/EleutherAI/math-lm) project, which will be upstreamed later this month.
* **[5/17/2023]** After fixing some miscellaneous bugs, we now fully support bf16.
* **[4/11/2023]** We have upgraded our Flash Attention implementation to support ALiBi positional embeddings.
* **[3/9/2023]** We have released GPT-NeoX 2.0.0, an upgraded version built on the latest DeepSpeed, which will be regularly synced with going forward.
## Versions
Prior to 3/9/2023, GPT-NeoX relied on [DeeperSpeed](https://github.com/EleutherAI/DeeperSpeed), which was based on an old version of DeepSpeed (0.3.15). In order to migrate to the latest upstream DeepSpeed version while allowing users to access the old versions of GPT-NeoX and DeeperSpeed, we have introduced two versioned releases for both libraries:
- Version 2.0 of [GPT-NeoX](https://github.com/EleutherAI/gpt-neox/releases/tag/v2.0) and [DeeperSpeed](https://github.com/EleutherAI/DeeperSpeed/releases/tag/v2.0) is built on the latest upstream DeepSpeed and will be maintained going forward.
- Version 1.0 of [GPT-NeoX](https://github.com/EleutherAI/gpt-neox/releases/tag/v1.0) and [DeeperSpeed](https://github.com/EleutherAI/DeeperSpeed/releases/tag/v1.0) preserves snapshots of the old stable versions that [GPT-NeoX-20B](https://arxiv.org/abs/2204.06745) and the [Pythia Suite](https://github.com/EleutherAI/pythia) were trained on.
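If you need to reproduce work built on the older stack, the release tags linked above can be checked out directly. A minimal sketch:

```bash
# Check out a pinned release of GPT-NeoX (the v1.0 and v2.0 tags are linked above).
git clone https://github.com/EleutherAI/gpt-neox.git
cd gpt-neox
git checkout v1.0   # snapshot used for GPT-NeoX-20B and Pythia; use v2.0 for the current line
```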
# Contents
- [GPT-NeoX](#gpt-neox)
* [Why GPT-NeoX?](#why-gpt-neox)
* [News](#news)
* [Versions](#versions)
- [Contents](#contents)
- [Quick Start](#quick-start)
* [Environment and Dependencies](#environment-and-dependencies)
+ [Host Setup](#host-setup)
+ [Flash Attention](#flash-attention)
+ [Multi-Node Launching](#multi-node-launching)
+ [Containerized Setup](#containerized-setup)
* [Usage](#usage)
- [Configuration](#configuration)
* [Mixture of Experts](#mixture-of-experts)
- [Datasets](#datasets)
* [Preconfigured Datasets](#preconfigured-datasets)
* [Using Custom Data](#using-custom-data)
- [Training and Finetuning](#training-and-finetuning)
* [Pretrained Models](#pretrained-models)
+ [GPT-NeoX-20B](#gpt-neox-20b)
+ [Pythia](#pythia)
+ [Polyglot](#polyglot)
- [Inference](#inference)
- [Evaluation](#evaluation)
- [Exporting to Hugging Face](#exporting-to-hugging-face)
- [Monitoring](#monitoring)
* [Weights and Biases](#weights-and-biases)
* [TensorBoard](#tensorboard)
- [Running on multi-node](#running-on-multi-node)
- [Profiling](#profiling)
- [Adoption and Publications](#adoption-and-publications)
* [Publications](#publications)
* [Models](#models)
+ [English LLMs](#english-llms)
+ [Non-English LLMs](#non-english-llms)
+ [Code Models](#code-models)
+ [Other Modalities](#other-modalities)
- [Administrative Notes](#administrative-notes)
* [Citing GPT-NeoX](#citing-gpt-neox)
* [Contributing](#contributing)
* [Licensing](#licensing)
* [Acknowledgements](#acknowledgements)
# Quick Start
## Environment and Dependencies
### Host Setup
First, make sure you are in an environment with Python 3.8 and an appropriate version of PyTorch 1.8 or later installed. **Note:** Some of the libraries that GPT-NeoX depends on have not been updated to be compatible with Python 3.10+. Python 3.9 appears to work, but this codebase has been developed and tested for Python 3.8.
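If you want to confirm the interpreter and PyTorch versions before proceeding, a quick check (assuming PyTorch is already installed) looks like:

```bash
# Sanity-check the Python and PyTorch versions (PyTorch assumed to be installed already).
python --version                                    # expect 3.8.x
python -c "import torch; print(torch.__version__)"  # expect 1.8 or later
```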
To install the remaining basic dependencies, run the following from the repository root:
```bash
pip install -r requirements/requirements.txt
pip install -r requirements/requirements-wandb.txt # optional, if logging using WandB
pip install -r requirements/requirements-tensorboard.txt # optional, if logging via tensorboard
python ./megatron/fused_kernels/setup.py install # optional, if using fused kernels
```
> [!Warning]
> Our codebase relies on [DeeperSpeed](https://github.com/EleutherAI/DeeperSpeed), our fork of the [DeepSpeed](https://github.com/microsoft/DeepSpeed) library with some added changes. We strongly recommend using Anaconda, a virtual machine, or some other form of environment isolation before continuing.
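One way to follow that recommendation is a dedicated virtual environment. The sketch below uses `venv` and is only an example of the general idea; conda or a container works just as well, and the right PyTorch build depends on your CUDA or ROCm version.

```bash
# Example of environment isolation with venv (conda or a container also works).
python3.8 -m venv neox-env
source neox-env/bin/activate
pip install --upgrade pip
# Install a PyTorch build matching your CUDA/ROCm version before the requirements files.
pip install torch
pip install -r requirements/requirements.txt
```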