LLM inference in C/C++

Backend

18.45MB

Resource description:

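Description

The main goal of llama.cpp is to enable LLM inference with minimal setup and state-of-the-art performance on a wide variety of hardware, both locally and in the cloud.

* Plain C/C++ implementation without any dependencies
* Apple silicon is a first-class citizen, optimized via the ARM NEON, Accelerate and Metal frameworks
* AVX, AVX2 and AVX512 support for x86 architectures
* 1.5-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit and 8-bit integer quantization for faster inference and reduced memory use
* Custom CUDA kernels for running LLMs on NVIDIA GPUs (AMD GPUs are supported via HIP)
* Vulkan and SYCL backend support
* CPU+GPU hybrid inference to partially accelerate models larger than the total VRAM capacity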
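To make the memory claim behind the quantization levels above concrete, here is a rough back-of-the-envelope sketch. It assumes weight storage is simply parameters × bits ÷ 8 and ignores per-block scale metadata, the KV cache and activations, so real model files will be somewhat larger. The 7-billion parameter count and the function name `weight_memory_gib` are illustrative assumptions, not taken from llama.cpp.

```python
def weight_memory_gib(n_params: int, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB: parameters x bits / 8 bytes."""
    return n_params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    n_params = 7_000_000_000  # hypothetical 7B-parameter model
    # 16 bits stands in for an unquantized F16 model; the rest are the
    # integer quantization widths listed in the description.
    for bits in (16, 8, 6, 5, 4, 3, 2, 1.5):
        size = weight_memory_gib(n_params, bits)
        print(f"{bits:>4} bits/weight: {size:6.2f} GiB")
```

Under these assumptions the footprint scales linearly with bit width, which is why a 4-bit model needs roughly a quarter of the memory of its F16 counterpart.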

# LLaMA.cpp HTTP Server

Fast, lightweight, pure C/C++ HTTP server based on [httplib](https://github.com/yhirose/cpp-httplib), [nlohmann::json](https://github.com/nlohmann/json) and **llama.cpp**.

Set of LLM REST APIs and a simple web front end to interact with llama.cpp.

**Features:**
 * LLM inference of F16 and quantized models on GPU and CPU
 * [OpenAI API](https://github.com/openai/openai-openapi) compatible chat completions and embeddings routes
 * Parallel decoding with multi-user support
 * Continuous batching
 * Multimodal (wip)
 * Monitoring endpoints
 * Schema-constrained JSON response format

The project is under active development, and we are [looking for feedback and contributors](https://github.com/ggerganov/llama.cpp/issues/4216).

## Usage

```
usage: ./llama-server [options]

general:

  -h, --help, --usage                 print usage and exit
  --version                           show version and build info
  -v, --verbose                       print verbose information
  --verbosity N                       set specific verbosity level (default: 0)
  --verbose-prompt                    print a verbose prompt before generation (default: false)
  --no-display-prompt                 don't print prompt at generation (default: false)
  -co, --color                        colorise output to distinguish prompt and user input from generations (default: false)
  -s, --seed SEED                     RNG seed (default: -1, use random seed for < 0)
  -t, --threads N                     number of threads to use during generation (default: 8)
  -tb, --threads-batch N              number of threads to use during batch and prompt processing (default: same as --threads)
  -td, --threads-draft N              number of threads to use during generation (default: same as --threads)
  -tbd, --threads-batch-draft N       number of threads to use during batch and prompt processing (default: same as --threads-draft)
  --draft N                           number of tokens to draft for speculative decoding (default: 5)
  -ps, --p-split N                    speculative decoding split probability (default: 0.1)
  -lcs, --lookup-cache-static FNAME   path to static lookup cache to use for lookup decoding (not updated by generation)
  -lcd, --lookup-cache-dynamic FNAME  path to dynamic lookup cache to use for lookup decoding (updated by generation)
  -c, --ctx-size N                    size of the prompt context (default: 0, 0 = loaded from model)
  -n, --predict N                     number of tokens to predict (default: -1, -1 = infinity, -2 = until context filled)
  -b, --batch-size N                  logical maximum batch size (default: 2048)
  -ub, --ubatch-size N                physical maximum batch size (default: 512)
  --keep N                            number of tokens to keep from the initial prompt (default: 0, -1 = all)
  --chunks N                          max number of chunks to process (default: -1, -1 = all)
  -fa, --flash-attn                   enable Flash Attention (default: disabled)
  -p, --prompt PROMPT                 prompt to start generation with
                                      in conversation mode, this will be used as system prompt
                                      (default: '')
  -f, --file FNAME                    a file containing the prompt (default: none)
  --in-file FNAME                     an input file (repeat to specify multiple files)
  -bf, --binary-file FNAME            binary file containing the prompt (default: none)
  -e, --escape                        process escapes sequences (\n, \r, \t, \', \", \\) (default: true)
  --no-escape                         do not process escape sequences
  -ptc, --print-token-count N         print token count every N tokens (default: -1)
  --prompt-cache FNAME                file to cache prompt state for faster startup (default: none)
  --prompt-cache-all                  if specified, saves user input and generations to cache as well
                                      not supported with --interactive or other interactive options
  --prompt-cache-ro                   if specified, uses the prompt cache but does not update it
  -r, --reverse-prompt PROMPT         halt generation at PROMPT, return control in interactive mode
                                      can be specified more than once for multiple prompts
  -sp, --special                      special tokens output enabled (default: false)
  -cnv, --conversation                run in conversation mode, does not print special tokens and suffix/prefix
                                      if suffix/prefix are not specified, default chat template will be used
                                      (default: false)
  -i, --interactive                   run in interactive mode (default: false)
  -if, --interactive-first            run in interactive mode and wait for input right away (default: false)
  -mli, --multiline-input             allows you to write or paste multiple lines without ending each in '\'
  --in-prefix-bos                     prefix BOS to user inputs, preceding the `--in-prefix` string
  --in-prefix STRING                  string to prefix user inputs with (default: empty)
  --in-suffix STRING                  string to suffix after user inputs with (default: empty)
  --spm-infill                        use Suffix/Prefix/Middle pattern for infill (instead of Prefix/Suffix/Middle) as some models prefer this. (default: disabled)

sampling:

  --samplers SAMPLERS                 samplers that will be used for generation in the order, separated by ';'
                                      (default: top_k;tfs_z;typical_p;top_p;min_p;temperature)
  --sampling-seq SEQUENCE             simplified sequence for samplers that will be used (default: kfypmt)
  --ignore-eos                        ignore end of stream token and continue generating (implies --logit-bias EOS-inf)
  --penalize-nl                       penalize newline tokens (default: false)
  --temp N                            temperature (default: 0.8)
  --top-k N                           top-k sampling (default: 40, 0 = disabled)
  --top-p N                           top-p sampling (default: 0.9, 1.0 = disabled)
  --min-p N                           min-p sampling (default: 0.1, 0.0 = disabled)
  --tfs N                             tail free sampling, parameter z (default: 1.0, 1.0 = disabled)
  --typical N                         locally typical sampling, parameter p (default: 1.0, 1.0 = disabled)
  --repeat-last-n N                   last n tokens to consider for penalize (default: 64, 0 = disabled, -1 = ctx_size)
  --repeat-penalty N                  penalize repeat sequence of tokens (default: 1.0, 1.0 = disabled)
  --presence-penalty N                repeat alpha presence penalty (default: 0.0, 0.0 = disabled)
  --frequency-penalty N               repeat alpha frequency penalty (default: 0.0, 0.0 = disabled)
  --dynatemp-range N                  dynamic temperature range (default: 0.0, 0.0 = disabled)
  --dynatemp-exp N                    dynamic temperature exponent (default: 1.0)
  --mirostat N                        use Mirostat sampling.
                                      Top K, Nucleus, Tail Free and Locally Typical samplers are ignored if used.
                                      (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)
  --mirostat-lr N                     Mirostat learning rate, parameter eta (default: 0.1)
  --mirostat-ent N                    Mirostat target entropy, parameter tau (default: 5.0)
  -l TOKEN_ID(+/-)BIAS                modifies the likelihood of token appearing in the completion,
                                      i.e. `--logit-bias 15043+1` to increase likelihood of token ' Hello',
                                      or `--logit-bias 15043-1` to decrease likelihood of token ' Hello'
  --cfg-negative-prompt PROMPT        negative prompt to use for guidance (default: '')
  --cfg-negative-prompt-file FNAME    negative prompt file to use for guidance
  --cfg-scale N                       strength of guidance (default: 1.0, 1.0 = disable)
```
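The sampling options above are easier to follow with a small worked example. The sketch below is an illustration in Python, not llama.cpp's implementation: it applies top-k, top-p and min-p filtering followed by temperature, in the relative order given by the default `--samplers` value, using the default values quoted in the usage text (40, 0.9, 0.1, 0.8). Tail free and locally typical sampling are left out because their defaults (1.0) disable them. All function names and the toy logits are assumptions made for this example, and rejecting `temp <= 0` is a simplification of this sketch.

```python
import math
import random


def softmax(logits):
    """Turn a {token_id: logit} dict into normalized probabilities."""
    peak = max(logits.values())
    exps = {tok: math.exp(val - peak) for tok, val in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def top_k(logits, k=40):
    """Keep the k highest-logit candidates (k <= 0 disables the filter)."""
    if k <= 0:
        return dict(logits)
    keep = sorted(logits, key=logits.get, reverse=True)[:k]
    return {tok: logits[tok] for tok in keep}


def top_p(logits, p=0.9):
    """Keep the smallest high-probability prefix whose mass reaches p."""
    if p >= 1.0:
        return dict(logits)
    probs = softmax(logits)
    kept, mass = {}, 0.0
    for tok in sorted(probs, key=probs.get, reverse=True):
        kept[tok] = logits[tok]
        mass += probs[tok]
        if mass >= p:
            break
    return kept


def min_p(logits, p=0.1):
    """Drop candidates whose probability is below p times the best one."""
    if p <= 0.0:
        return dict(logits)
    probs = softmax(logits)
    cutoff = p * max(probs.values())
    return {tok: logits[tok] for tok in logits if probs[tok] >= cutoff}


def sample(logits, temp=0.8, seed=None):
    """Filter candidates, apply temperature, then draw one token id."""
    if temp <= 0:
        raise ValueError("this sketch requires temp > 0")
    candidates = min_p(top_p(top_k(logits)))
    scaled = softmax({tok: val / temp for tok, val in candidates.items()})
    rng = random.Random(seed)  # seed=None lets Python pick a random seed
    tokens = list(scaled)
    return rng.choices(tokens, weights=[scaled[t] for t in tokens])[0]


if __name__ == "__main__":
    # Made-up logits for a four-token vocabulary (probabilities 0.5/0.3/0.15/0.05).
    toy = {0: math.log(0.5), 1: math.log(0.3), 2: math.log(0.15), 3: math.log(0.05)}
    print("top-k=2 keeps:", sorted(top_k(toy, k=2)))
    print("min-p=0.2 keeps:", sorted(min_p(toy, p=0.2)))
    print("sampled token id:", sample(toy, seed=42))
```

Each filter recomputes probabilities over the candidates that survived the previous step, so the thresholds always refer to the current candidate set rather than the full vocabulary. Lower temperatures sharpen the final distribution toward the most likely token, while higher ones flatten it.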

Resource file list:

llama.cpp-master.zip contains approximately 1096 files

llama.cpp-master/
llama.cpp-master/.clang-tidy 791B
llama.cpp-master/.devops/
llama.cpp-master/.devops/cloud-v-pipeline 1.05KB
llama.cpp-master/.devops/full-cuda.Dockerfile 848B
llama.cpp-master/.devops/full-rocm.Dockerfile 1.1KB
llama.cpp-master/.devops/full.Dockerfile 470B
llama.cpp-master/.devops/llama-cli-cuda.Dockerfile 864B
llama.cpp-master/.devops/llama-cli-intel.Dockerfile 740B
llama.cpp-master/.devops/llama-cli-rocm.Dockerfile 1.01KB
llama.cpp-master/.devops/llama-cli-vulkan.Dockerfile 724B
llama.cpp-master/.devops/llama-cli.Dockerfile 376B
llama.cpp-master/.devops/llama-cpp-cuda.srpm.spec 2.56KB
llama.cpp-master/.devops/llama-cpp.srpm.spec 2.63KB
llama.cpp-master/.devops/llama-server-cuda.Dockerfile 1020B
llama.cpp-master/.devops/llama-server-intel.Dockerfile 900B
llama.cpp-master/.devops/llama-server-rocm.Dockerfile 1.18KB
llama.cpp-master/.devops/llama-server-vulkan.Dockerfile 843B
llama.cpp-master/.devops/llama-server.Dockerfile 519B
llama.cpp-master/.devops/nix/
llama.cpp-master/.devops/nix/apps.nix 434B
llama.cpp-master/.devops/nix/devshells.nix 279B
llama.cpp-master/.devops/nix/docker.nix 850B
llama.cpp-master/.devops/nix/jetson-support.nix 1.05KB
llama.cpp-master/.devops/nix/nixpkgs-instances.nix 1.67KB
llama.cpp-master/.devops/nix/package.nix 9.74KB
llama.cpp-master/.devops/nix/scope.nix 514B
llama.cpp-master/.devops/nix/sif.nix 729B
llama.cpp-master/.devops/tools.sh 1.67KB
llama.cpp-master/.dockerignore 158B
llama.cpp-master/.ecrc 80B
llama.cpp-master/.editorconfig 599B
llama.cpp-master/.flake8 544B
llama.cpp-master/.github/
llama.cpp-master/.github/ISSUE_TEMPLATE/
llama.cpp-master/.github/ISSUE_TEMPLATE/01-bug-low.yml 1.7KB
llama.cpp-master/.github/ISSUE_TEMPLATE/02-bug-medium.yml 1.71KB
llama.cpp-master/.github/ISSUE_TEMPLATE/03-bug-high.yml 1.72KB
llama.cpp-master/.github/ISSUE_TEMPLATE/04-bug-critical.yml 1.7KB
llama.cpp-master/.github/ISSUE_TEMPLATE/05-enhancement.yml 2.35KB
llama.cpp-master/.github/ISSUE_TEMPLATE/06-research.yml 1.69KB
llama.cpp-master/.github/ISSUE_TEMPLATE/07-refactor.yml 1.2KB
llama.cpp-master/.github/ISSUE_TEMPLATE/config.yml 524B
llama.cpp-master/.github/labeler.yml 2.32KB
llama.cpp-master/.github/pull_request_template.md 193B
llama.cpp-master/.github/workflows/
llama.cpp-master/.github/workflows/bench.yml 10.42KB
llama.cpp-master/.github/workflows/build.yml 42.25KB
llama.cpp-master/.github/workflows/close-issue.yml 717B
llama.cpp-master/.github/workflows/docker.yml 5.02KB
llama.cpp-master/.github/workflows/editorconfig.yml 607B
llama.cpp-master/.github/workflows/gguf-publish.yml 1.2KB
llama.cpp-master/.github/workflows/labeler.yml 355B
llama.cpp-master/.github/workflows/nix-ci-aarch64.yml 2.23KB
llama.cpp-master/.github/workflows/nix-ci.yml 2.51KB
llama.cpp-master/.github/workflows/nix-flake-update.yml 607B
llama.cpp-master/.github/workflows/nix-publish-flake.yml 1.11KB
llama.cpp-master/.github/workflows/python-check-requirements.yml 966B
llama.cpp-master/.github/workflows/python-lint.yml 561B
llama.cpp-master/.github/workflows/python-type-check.yml 995B
llama.cpp-master/.github/workflows/server.yml 6KB
llama.cpp-master/.gitignore 1.45KB
llama.cpp-master/.gitmodules 94B
llama.cpp-master/.pre-commit-config.yaml 447B
llama.cpp-master/AUTHORS 32.93KB
llama.cpp-master/CMakeLists.txt 6.29KB
llama.cpp-master/CMakePresets.json 2.82KB
llama.cpp-master/CONTRIBUTING.md 2.2KB
llama.cpp-master/LICENSE 1.05KB
llama.cpp-master/Makefile 48.57KB
llama.cpp-master/Package.swift 2KB
llama.cpp-master/README.md 28.74KB
llama.cpp-master/SECURITY.md 4.97KB
llama.cpp-master/ci/
llama.cpp-master/ci/README.md 1.06KB
llama.cpp-master/ci/run.sh 37.51KB
llama.cpp-master/cmake/
llama.cpp-master/cmake/arm64-windows-llvm.cmake 592B
llama.cpp-master/cmake/arm64-windows-msvc.cmake 192B
llama.cpp-master/cmake/build-info.cmake 1.57KB
llama.cpp-master/cmake/git-vars.cmake 717B
llama.cpp-master/cmake/llama-config.cmake.in 2.39KB
llama.cpp-master/cmake/llama.pc.in 250B
llama.cpp-master/common/
llama.cpp-master/common/CMakeLists.txt 2.74KB
llama.cpp-master/common/base64.hpp 12.58KB
llama.cpp-master/common/build-info.cpp.in 186B
llama.cpp-master/common/cmake/
llama.cpp-master/common/cmake/build-info-gen-cpp.cmake 943B
llama.cpp-master/common/common.cpp 135.3KB
llama.cpp-master/common/common.h 19.7KB
llama.cpp-master/common/console.cpp 15.86KB
llama.cpp-master/common/console.h 359B
llama.cpp-master/common/grammar-parser.cpp 21.75KB
llama.cpp-master/common/grammar-parser.h 874B
llama.cpp-master/common/json-schema-to-grammar.cpp 42.49KB
llama.cpp-master/common/json-schema-to-grammar.h 211B
llama.cpp-master/common/json.hpp 898.69KB
llama.cpp-master/common/log.h 24.09KB
llama.cpp-master/common/ngram-cache.cpp 11.08KB
llama.cpp-master/common/ngram-cache.h 3.98KB
llama.cpp-master/common/sampling.cpp 17.68KB
llama.cpp-master/common/sampling.h 6.33KB
llama.cpp-master/common/stb_image.h 313.42KB
llama.cpp-master/common/train.cpp 64.78KB
llama.cpp-master/common/train.h 7.7KB
llama.cpp-master/convert_hf_to_gguf.py 169.04KB
llama.cpp-master/convert_hf_to_gguf_update.py 14.43KB
llama.cpp-master/convert_llama_ggml_to_gguf.py 18.63KB
llama.cpp-master/convert_lora_to_gguf.py 14.04KB
llama.cpp-master/docs/
llama.cpp-master/docs/android.md 2.42KB
llama.cpp-master/docs/backend/
llama.cpp-master/docs/backend/BLIS.md 1.7KB
llama.cpp-master/docs/backend/SYCL.md 23.63KB
llama.cpp-master/docs/build.md 19.97KB
llama.cpp-master/docs/development/
llama.cpp-master/docs/development/HOWTO-add-model.md 4.8KB
llama.cpp-master/docs/development/debugging-tests.md 3.1KB
llama.cpp-master/docs/development/llama-star/
llama.cpp-master/docs/development/llama-star/idea-arch.key 477.14KB
llama.cpp-master/docs/development/llama-star/idea-arch.pdf 41.34KB
llama.cpp-master/docs/development/token_generation_performance_tips.md 2.25KB
llama.cpp-master/docs/docker.md 4.72KB
llama.cpp-master/docs/install.md 872B
llama.cpp-master/examples/
llama.cpp-master/examples/CMakeLists.txt 1.33KB
llama.cpp-master/examples/Miku.sh 2.57KB
llama.cpp-master/examples/baby-llama/
llama.cpp-master/examples/baby-llama/CMakeLists.txt 239B
llama.cpp-master/examples/baby-llama/baby-llama.cpp 61.05KB
llama.cpp-master/examples/base-translate.sh 1001B
llama.cpp-master/examples/batched-bench/
llama.cpp-master/examples/batched-bench/CMakeLists.txt 245B
llama.cpp-master/examples/batched-bench/README.md 2.74KB
llama.cpp-master/examples/batched-bench/batched-bench.cpp 6.63KB
llama.cpp-master/examples/batched.swift/
llama.cpp-master/examples/batched.swift/.gitignore 173B
llama.cpp-master/examples/batched.swift/Makefile 230B
llama.cpp-master/examples/batched.swift/Package.swift 766B
llama.cpp-master/examples/batched.swift/README.md 112B
llama.cpp-master/examples/batched.swift/Sources/
llama.cpp-master/examples/batched.swift/Sources/main.swift 7.57KB
llama.cpp-master/examples/batched/
llama.cpp-master/examples/batched/CMakeLists.txt 233B
llama.cpp-master/examples/batched/README.md 1.39KB
llama.cpp-master/examples/batched/batched.cpp 7.81KB
llama.cpp-master/examples/benchmark/
llama.cpp-master/examples/benchmark/CMakeLists.txt 312B
llama.cpp-master/examples/benchmark/benchmark-matmult.cpp 9.61KB
llama.cpp-master/examples/chat-13B.bat 2.39KB
llama.cpp-master/examples/chat-13B.sh 1.31KB
llama.cpp-master/examples/chat-persistent.sh 4.93KB
llama.cpp-master/examples/chat-vicuna.sh 1.3KB
llama.cpp-master/examples/chat.sh 349B
llama.cpp-master/examples/convert-llama2c-to-ggml/
llama.cpp-master/examples/convert-llama2c-to-ggml/CMakeLists.txt 265B
llama.cpp-master/examples/convert-llama2c-to-ggml/README.md 1.52KB
llama.cpp-master/examples/convert-llama2c-to-ggml/convert-llama2c-to-ggml.cpp 34.26KB
llama.cpp-master/examples/convert_legacy_llama.py 58.15KB
llama.cpp-master/examples/cvector-generator/
llama.cpp-master/examples/cvector-generator/CMakeLists.txt 261B
llama.cpp-master/examples/cvector-generator/README.md 1.54KB
llama.cpp-master/examples/cvector-generator/completions.txt 6.75KB
llama.cpp-master/examples/cvector-generator/cvector-generator.cpp 17.91KB
llama.cpp-master/examples/cvector-generator/mean.hpp 1.49KB
llama.cpp-master/examples/cvector-generator/negative.txt 989B
llama.cpp-master/examples/cvector-generator/pca.hpp 11.43KB
llama.cpp-master/examples/cvector-generator/positive.txt 955B
llama.cpp-master/examples/deprecation-warning/
llama.cpp-master/examples/deprecation-warning/README.md 1.64KB
llama.cpp-master/examples/deprecation-warning/deprecation-warning.cpp 1.15KB
llama.cpp-master/examples/embedding/
llama.cpp-master/examples/embedding/CMakeLists.txt 237B
llama.cpp-master/examples/embedding/README.md 2.14KB
llama.cpp-master/examples/embedding/embedding.cpp 9.36KB
llama.cpp-master/examples/eval-callback/
llama.cpp-master/examples/eval-callback/CMakeLists.txt 530B
llama.cpp-master/examples/eval-callback/README.md 4.61KB
llama.cpp-master/examples/eval-callback/eval-callback.cpp 6.09KB
llama.cpp-master/examples/export-lora/
llama.cpp-master/examples/export-lora/CMakeLists.txt 241B
llama.cpp-master/examples/export-lora/README.md 1.12KB
llama.cpp-master/examples/export-lora/export-lora.cpp 15.98KB
llama.cpp-master/examples/gbnf-validator/
llama.cpp-master/examples/gbnf-validator/CMakeLists.txt 247B
llama.cpp-master/examples/gbnf-validator/gbnf-validator.cpp 4.36KB
llama.cpp-master/examples/gguf-hash/
llama.cpp-master/examples/gguf-hash/CMakeLists.txt 618B
llama.cpp-master/examples/gguf-hash/README.md 10.41KB
llama.cpp-master/examples/gguf-hash/deps/
llama.cpp-master/examples/gguf-hash/deps/rotate-bits/
llama.cpp-master/examples/gguf-hash/deps/rotate-bits/package.json 255B
llama.cpp-master/examples/gguf-hash/deps/rotate-bits/rotate-bits.h 1017B
llama.cpp-master/examples/gguf-hash/deps/sha1/
llama.cpp-master/examples/gguf-hash/deps/sha1/package.json 200B
llama.cpp-master/examples/gguf-hash/deps/sha1/sha1.c 7.44KB
llama.cpp-master/examples/gguf-hash/deps/sha1/sha1.h 717B
llama.cpp-master/examples/gguf-hash/deps/sha256/
llama.cpp-master/examples/gguf-hash/deps/sha256/package.json 283B
llama.cpp-master/examples/gguf-hash/deps/sha256/sha256.c 5.16KB
llama.cpp-master/examples/gguf-hash/deps/sha256/sha256.h 549B
llama.cpp-master/examples/gguf-hash/deps/xxhash/
llama.cpp-master/examples/gguf-hash/deps/xxhash/clib.json 255B
llama.cpp-master/examples/gguf-hash/deps/xxhash/xxhash.c 1.81KB
llama.cpp-master/examples/gguf-hash/deps/xxhash/xxhash.h 258.54KB
llama.cpp-master/examples/gguf-hash/gguf-hash.cpp 23.38KB
llama.cpp-master/examples/gguf-split/
llama.cpp-master/examples/gguf-split/CMakeLists.txt 239B
llama.cpp-master/examples/gguf-split/README.md 343B
llama.cpp-master/examples/gguf-split/gguf-split.cpp 19.4KB
llama.cpp-master/examples/gguf-split/tests.sh 2.12KB
llama.cpp-master/examples/gguf/
llama.cpp-master/examples/gguf/CMakeLists.txt 219B
llama.cpp-master/examples/gguf/gguf.cpp 7.92KB
llama.cpp-master/examples/gritlm/
llama.cpp-master/examples/gritlm/CMakeLists.txt 231B
llama.cpp-master/examples/gritlm/README.md 2.73KB
llama.cpp-master/examples/gritlm/gritlm.cpp 9.74KB
llama.cpp-master/examples/imatrix/
llama.cpp-master/examples/imatrix/CMakeLists.txt 233B
llama.cpp-master/examples/imatrix/README.md 2KB
llama.cpp-master/examples/imatrix/imatrix.cpp 22.23KB
llama.cpp-master/examples/infill/
llama.cpp-master/examples/infill/CMakeLists.txt 231B
llama.cpp-master/examples/infill/README.md 2.61KB
llama.cpp-master/examples/infill/infill.cpp 23.62KB
llama.cpp-master/examples/jeopardy/
llama.cpp-master/examples/jeopardy/README.md 1KB
llama.cpp-master/examples/jeopardy/graph.py 1.61KB
llama.cpp-master/examples/jeopardy/jeopardy.sh 851B
llama.cpp-master/examples/jeopardy/qasheet.csv 16.28KB
llama.cpp-master/examples/jeopardy/questions.txt 12.02KB
llama.cpp-master/examples/json_schema_pydantic_example.py 3.14KB
llama.cpp-master/examples/json_schema_to_grammar.py 32.91KB
llama.cpp-master/examples/llama-bench/
llama.cpp-master/examples/llama-bench/CMakeLists.txt 235B
llama.cpp-master/examples/llama-bench/README.md 13.99KB
llama.cpp-master/examples/llama-bench/llama-bench.cpp 51.46KB
llama.cpp-master/examples/llama.android/
llama.cpp-master/examples/llama.android/.gitignore 431B
llama.cpp-master/examples/llama.android/README.md
llama.cpp-master/examples/llama.android/app/
llama.cpp-master/examples/llama.android/app/.gitignore 7B
llama.cpp-master/examples/llama.android/app/build.gradle.kts 1.96KB
llama.cpp-master/examples/llama.android/app/proguard-rules.pro 751B
llama.cpp-master/examples/llama.android/app/src/
llama.cpp-master/examples/llama.android/app/src/main/
llama.cpp-master/examples/llama.android/app/src/main/AndroidManifest.xml 1.02KB
llama.cpp-master/examples/llama.android/app/src/main/java/
llama.cpp-master/examples/llama.android/app/src/main/java/com/
llama.cpp-master/examples/llama.android/app/src/main/java/com/example/
llama.cpp-master/examples/llama.android/app/src/main/java/com/example/llama/
llama.cpp-master/examples/llama.android/app/src/main/java/com/example/llama/Downloadable.kt 4.42KB
llama.cpp-master/examples/llama.android/app/src/main/java/com/example/llama/MainActivity.kt 5.5KB
llama.cpp-master/examples/llama.android/app/src/main/java/com/example/llama/MainViewModel.kt 2.84KB
llama.cpp-master/examples/llama.android/app/src/main/java/com/example/llama/ui/
llama.cpp-master/examples/llama.android/app/src/main/java/com/example/llama/ui/theme/
llama.cpp-master/examples/llama.android/app/src/main/java/com/example/llama/ui/theme/Color.kt 282B
llama.cpp-master/examples/llama.android/app/src/main/java/com/example/llama/ui/theme/Theme.kt 2.14KB
llama.cpp-master/examples/llama.android/app/src/main/java/com/example/llama/ui/theme/Type.kt 987B
llama.cpp-master/examples/llama.android/app/src/main/res/
llama.cpp-master/examples/llama.android/app/src/main/res/drawable/
llama.cpp-master/examples/llama.android/app/src/main/res/drawable/ic_launcher_background.xml 5.47KB
llama.cpp-master/examples/llama.android/app/src/main/res/drawable/ic_launcher_foreground.xml 1.66KB
llama.cpp-master/examples/llama.android/app/src/main/res/mipmap-anydpi/
llama.cpp-master/examples/llama.android/app/src/main/res/mipmap-anydpi/ic_launcher.xml 344B
llama.cpp-master/examples/llama.android/app/src/main/res/mipmap-anydpi/ic_launcher_round.xml 344B
llama.cpp-master/examples/llama.android/app/src/main/res/mipmap-hdpi/
llama.cpp-master/examples/llama.android/app/src/main/res/mipmap-hdpi/ic_launcher.webp 1.37KB
llama.cpp-master/examples/llama.android/app/src/main/res/mipmap-hdpi/ic_launcher_round.webp 2.83KB
llama.cpp-master/examples/llama.android/app/src/main/res/mipmap-mdpi/
llama.cpp-master/examples/llama.android/app/src/main/res/mipmap-mdpi/ic_launcher.webp 982B
llama.cpp-master/examples/llama.android/app/src/main/res/mipmap-mdpi/ic_launcher_round.webp 1.73KB
llama.cpp-master/examples/llama.android/app/src/main/res/mipmap-xhdpi/
llama.cpp-master/examples/llama.android/app/src/main/res/mipmap-xhdpi/ic_launcher.webp 1.86KB
llama.cpp-master/examples/llama.android/app/src/main/res/mipmap-xhdpi/ic_launcher_round.webp 3.83KB
llama.cpp-master/examples/llama.android/app/src/main/res/mipmap-xxhdpi/
llama.cpp-master/examples/llama.android/app/src/main/res/mipmap-xxhdpi/ic_launcher.webp 2.82KB
llama.cpp-master/examples/llama.android/app/src/main/res/mipmap-xxhdpi/ic_launcher_round.webp 5.78KB
llama.cpp-master/examples/llama.android/app/src/main/res/mipmap-xxxhdpi/
llama.cpp-master/examples/llama.android/app/src/main/res/mipmap-xxxhdpi/ic_launcher.webp 3.75KB
llama.cpp-master/examples/llama.android/app/src/main/res/mipmap-xxxhdpi/ic_launcher_round.webp 7.6KB
llama.cpp-master/examples/llama.android/app/src/main/res/values/
llama.cpp-master/examples/llama.android/app/src/main/res/values/colors.xml 379B
llama.cpp-master/examples/llama.android/app/src/main/res/values/strings.xml 75B
llama.cpp-master/examples/llama.android/app/src/main/res/values/themes.xml 155B
llama.cpp-master/examples/llama.android/app/src/main/res/xml/
llama.cpp-master/examples/llama.android/app/src/main/res/xml/backup_rules.xml 479B
llama.cpp-master/examples/llama.android/app/src/main/res/xml/data_extraction_rules.xml 552B
llama.cpp-master/examples/llama.android/build.gradle.kts 299B
llama.cpp-master/examples/llama.android/gradle.properties 1.33KB
llama.cpp-master/examples/llama.android/gradle/
llama.cpp-master/examples/llama.android/gradle/wrapper/
llama.cpp-master/examples/llama.android/gradle/wrapper/gradle-wrapper.jar 57.82KB
llama.cpp-master/examples/llama.android/gradle/wrapper/gradle-wrapper.properties 231B
llama.cpp-master/examples/llama.android/gradlew 5.63KB
llama.cpp-master/examples/llama.android/llama/
llama.cpp-master/examples/llama.android/llama/.gitignore 7B
llama.cpp-master/examples/llama.android/llama/build.gradle.kts 1.69KB
llama.cpp-master/examples/llama.android/llama/consumer-rules.pro
llama.cpp-master/examples/llama.android/llama/proguard-rules.pro 751B
llama.cpp-master/examples/llama.android/llama/src/
llama.cpp-master/examples/llama.android/llama/src/androidTest/
llama.cpp-master/examples/llama.android/llama/src/androidTest/java/
llama.cpp-master/examples/llama.android/llama/src/androidTest/java/android/
llama.cpp-master/examples/llama.android/llama/src/androidTest/java/android/llama/
llama.cpp-master/examples/llama.android/llama/src/androidTest/java/android/llama/cpp/
llama.cpp-master/examples/llama.android/llama/src/androidTest/java/android/llama/cpp/ExampleInstrumentedTest.kt 667B
llama.cpp-master/examples/llama.android/llama/src/main/
llama.cpp-master/examples/llama.android/llama/src/main/AndroidManifest.xml 122B
llama.cpp-master/examples/llama.android/llama/src/main/cpp/
llama.cpp-master/examples/llama.android/llama/src/main/cpp/CMakeLists.txt 2.12KB
llama.cpp-master/examples/llama.android/llama/src/main/cpp/llama-android.cpp 13.28KB
llama.cpp-master/examples/llama.android/llama/src/main/java/
llama.cpp-master/examples/llama.android/llama/src/main/java/android/
llama.cpp-master/examples/llama.android/llama/src/main/java/android/llama/
llama.cpp-master/examples/llama.android/llama/src/main/java/android/llama/cpp/
llama.cpp-master/examples/llama.android/llama/src/main/java/android/llama/cpp/LLamaAndroid.kt 5.32KB
llama.cpp-master/examples/llama.android/llama/src/test/
llama.cpp-master/examples/llama.android/llama/src/test/java/
llama.cpp-master/examples/llama.android/llama/src/test/java/android/
llama.cpp-master/examples/llama.android/llama/src/test/java/android/llama/
llama.cpp-master/examples/llama.android/llama/src/test/java/android/llama/cpp/
llama.cpp-master/examples/llama.android/llama/src/test/java/android/llama/cpp/ExampleUnitTest.kt 342B
llama.cpp-master/examples/llama.android/settings.gradle.kts 349B
llama.cpp-master/examples/llama.swiftui/
llama.cpp-master/examples/llama.swiftui/.gitignore 24B
llama.cpp-master/examples/llama.swiftui/README.md 517B
llama.cpp-master/examples/llama.swiftui/llama.cpp.swift/
llama.cpp-master/examples/llama.swiftui/llama.cpp.swift/LibLlama.swift 11.24KB
llama.cpp-master/examples/llama.swiftui/llama.swiftui.xcodeproj/
llama.cpp-master/examples/llama.swiftui/llama.swiftui.xcodeproj/project.pbxproj 18KB
llama.cpp-master/examples/llama.swiftui/llama.swiftui.xcodeproj/project.xcworkspace/
llama.cpp-master/examples/llama.swiftui/llama.swiftui.xcodeproj/project.xcworkspace/contents.xcworkspacedata 135B
llama.cpp-master/examples/llama.swiftui/llama.swiftui.xcodeproj/project.xcworkspace/xcshareddata/
llama.cpp-master/examples/llama.swiftui/llama.swiftui.xcodeproj/project.xcworkspace/xcshareddata/IDEWorkspaceChecks.plist 244B
llama.cpp-master/examples/llama.swiftui/llama.swiftui/
llama.cpp-master/examples/llama.swiftui/llama.swiftui/Assets.xcassets/
llama.cpp-master/examples/llama.swiftui/llama.swiftui/Assets.xcassets/AppIcon.appiconset/
llama.cpp-master/examples/llama.swiftui/llama.swiftui/Assets.xcassets/AppIcon.appiconset/Contents.json 177B
llama.cpp-master/examples/llama.swiftui/llama.swiftui/Assets.xcassets/Contents.json 63B
llama.cpp-master/examples/llama.swiftui/llama.swiftui/Models/
llama.cpp-master/examples/llama.swiftui/llama.swiftui/Models/LlamaState.swift 6.99KB
llama.cpp-master/examples/llama.swiftui/llama.swiftui/Resources/
llama.cpp-master/examples/llama.swiftui/llama.swiftui/Resources/models/
llama.cpp-master/examples/llama.swiftui/llama.swiftui/Resources/models/.gitignore
llama.cpp-master/examples/llama.swiftui/llama.swiftui/UI/
llama.cpp-master/examples/llama.swiftui/llama.swiftui/UI/ContentView.swift 4.73KB
llama.cpp-master/examples/llama.swiftui/llama.swiftui/UI/DownloadButton.swift 4.41KB
llama.cpp-master/examples/llama.swiftui/llama.swiftui/UI/InputButton.swift 4.74KB
llama.cpp-master/examples/llama.swiftui/llama.swiftui/UI/LoadCustomButton.swift 1.29KB
llama.cpp-master/examples/llama.swiftui/llama.swiftui/llama_swiftuiApp.swift 146B
llama.cpp-master/examples/llama.vim 5.05KB
llama.cpp-master/examples/llava/
llama.cpp-master/examples/llava/CMakeLists.txt 1.26KB
llama.cpp-master/examples/llava/MobileVLM-README.md 18.42KB
llama.cpp-master/examples/llava/README.md 5.18KB
llama.cpp-master/examples/llava/android/
llama.cpp-master/examples/llava/android/adb_run.sh 2.55KB
llama.cpp-master/examples/llava/android/build_64.sh 207B
llama.cpp-master/examples/llava/clip.cpp 85.03KB
llama.cpp-master/examples/llava/clip.h 2.83KB
llama.cpp-master/examples/llava/convert_image_encoder_to_gguf.py 13.58KB
llama.cpp-master/examples/llava/llava-cli.cpp 12.57KB
llama.cpp-master/examples/llava/llava.cpp 17.95KB
llama.cpp-master/examples/llava/llava.h 1.79KB
llama.cpp-master/examples/llava/llava_surgery.py 1.34KB
llama.cpp-master/examples/llava/llava_surgery_v2.py 6.89KB
llama.cpp-master/examples/llava/requirements.txt 143B
llama.cpp-master/examples/llm.vim 921B
llama.cpp-master/examples/lookahead/
llama.cpp-master/examples/lookahead/CMakeLists.txt 237B
llama.cpp-master/examples/lookahead/README.md 195B
llama.cpp-master/examples/lookahead/lookahead.cpp 16.12KB
llama.cpp-master/examples/lookup/
llama.cpp-master/examples/lookup/CMakeLists.txt 965B
llama.cpp-master/examples/lookup/README.md 487B
llama.cpp-master/examples/lookup/lookup-create.cpp 1.17KB
llama.cpp-master/examples/lookup/lookup-merge.cpp 1.34KB
llama.cpp-master/examples/lookup/lookup-stats.cpp 5.6KB
llama.cpp-master/examples/lookup/lookup.cpp 8.37KB
llama.cpp-master/examples/main-cmake-pkg/
llama.cpp-master/examples/main-cmake-pkg/.gitignore 387B
llama.cpp-master/examples/main-cmake-pkg/CMakeLists.txt 1.25KB
llama.cpp-master/examples/main-cmake-pkg/README.md 1.25KB
llama.cpp-master/examples/main/
llama.cpp-master/examples/main/CMakeLists.txt 226B
llama.cpp-master/examples/main/README.md 27.16KB
llama.cpp-master/examples/main/main.cpp 39.6KB
llama.cpp-master/examples/parallel/
llama.cpp-master/examples/parallel/CMakeLists.txt 235B
llama.cpp-master/examples/parallel/README.md 93B
llama.cpp-master/examples/parallel/parallel.cpp 15.49KB
llama.cpp-master/examples/passkey/
llama.cpp-master/examples/passkey/CMakeLists.txt 233B
llama.cpp-master/examples/passkey/README.md 409B
llama.cpp-master/examples/passkey/passkey.cpp 8.7KB
llama.cpp-master/examples/perplexity/
llama.cpp-master/examples/perplexity/CMakeLists.txt 239B
llama.cpp-master/examples/perplexity/README.md 19.51KB
llama.cpp-master/examples/perplexity/perplexity.cpp 79.32KB
llama.cpp-master/examples/pydantic_models_to_grammar.py 54.87KB
llama.cpp-master/examples/pydantic_models_to_grammar_examples.py 13.4KB
llama.cpp-master/examples/quantize-stats/
llama.cpp-master/examples/quantize-stats/CMakeLists.txt 310B
llama.cpp-master/examples/quantize-stats/quantize-stats.cpp 15.66KB
llama.cpp-master/examples/quantize/
llama.cpp-master/examples/quantize/CMakeLists.txt 294B
llama.cpp-master/examples/quantize/README.md 5.05KB
llama.cpp-master/examples/quantize/quantize.cpp 18.61KB
llama.cpp-master/examples/quantize/tests.sh 1.5KB
llama.cpp-master/examples/reason-act.sh 355B
llama.cpp-master/examples/regex_to_grammar.py 431B
llama.cpp-master/examples/retrieval/
llama.cpp-master/examples/retrieval/CMakeLists.txt 237B
llama.cpp-master/examples/retrieval/README.md 2.09KB
llama.cpp-master/examples/retrieval/retrieval.cpp 10.11KB
llama.cpp-master/examples/rpc/
llama.cpp-master/examples/rpc/CMakeLists.txt 95B
llama.cpp-master/examples/rpc/README.md 2.28KB
llama.cpp-master/examples/rpc/rpc-server.cpp 4.15KB
llama.cpp-master/examples/save-load-state/
llama.cpp-master/examples/save-load-state/CMakeLists.txt 249B
llama.cpp-master/examples/save-load-state/save-load-state.cpp 8.36KB
llama.cpp-master/examples/server-llama2-13B.sh 790B
llama.cpp-master/examples/server/
llama.cpp-master/examples/server/CMakeLists.txt 1.79KB
llama.cpp-master/examples/server/README.md 42.75KB
llama.cpp-master/examples/server/bench/
llama.cpp-master/examples/server/bench/README.md 4.2KB
llama.cpp-master/examples/server/bench/bench.py 12.99KB
llama.cpp-master/examples/server/bench/prometheus.yml 183B
llama.cpp-master/examples/server/bench/requirements.txt 20B
llama.cpp-master/examples/server/bench/script.js 5.76KB
llama.cpp-master/examples/server/chat-llama2.sh 2.46KB
llama.cpp-master/examples/server/chat.mjs 3.79KB
llama.cpp-master/examples/server/chat.sh 1.93KB
llama.cpp-master/examples/server/deps.sh 374B
llama.cpp-master/examples/server/httplib.h 303.63KB
llama.cpp-master/examples/server/public/
llama.cpp-master/examples/server/public/colorthemes.css 11.12KB
llama.cpp-master/examples/server/public/completion.js 5.81KB
llama.cpp-master/examples/server/public/favicon.ico 4.03KB
llama.cpp-master/examples/server/public/index-new.html 47.66KB
llama.cpp-master/examples/server/public/index.html 41.58KB
llama.cpp-master/examples/server/public/index.js 22.53KB
llama.cpp-master/examples/server/public/json-schema-to-grammar.mjs 28.5KB
llama.cpp-master/examples/server/public/prompt-formats.js 6.02KB
llama.cpp-master/examples/server/public/style.css 19.6KB
llama.cpp-master/examples/server/public/system-prompts.js 10.5KB
llama.cpp-master/examples/server/public/theme-beeninorder.css 6.95KB
llama.cpp-master/examples/server/public/theme-ketivah.css 7.14KB
llama.cpp-master/examples/server/public/theme-mangotango.css 6.58KB
llama.cpp-master/examples/server/public/theme-playground.css 6.83KB
llama.cpp-master/examples/server/public/theme-polarnight.css 8.01KB
llama.cpp-master/examples/server/public/theme-snowstorm.css 8KB
llama.cpp-master/examples/server/public_simplechat/
llama.cpp-master/examples/server/public_simplechat/datautils.mjs 8.94KB
llama.cpp-master/examples/server/public_simplechat/index.html 1.91KB
llama.cpp-master/examples/server/public_simplechat/readme.md 14.42KB
llama.cpp-master/examples/server/public_simplechat/simplechat.css 1KB
llama.cpp-master/examples/server/public_simplechat/simplechat.js 30.53KB
llama.cpp-master/examples/server/public_simplechat/simplechat_screens.webp 20.88KB
llama.cpp-master/examples/server/public_simplechat/ui.mjs 5.94KB
llama.cpp-master/examples/server/server.cpp 139.03KB
llama.cpp-master/examples/server/tests/
llama.cpp-master/examples/server/tests/README.md 2.79KB
llama.cpp-master/examples/server/tests/features/
llama.cpp-master/examples/server/tests/features/embeddings.feature 2.42KB
llama.cpp-master/examples/server/tests/features/environment.py 2.53KB
llama.cpp-master/examples/server/tests/features/issues.feature 139B
llama.cpp-master/examples/server/tests/features/lora.feature 1.14KB
llama.cpp-master/examples/server/tests/features/parallel.feature 2.7KB
llama.cpp-master/examples/server/tests/features/passkey.feature 2.66KB
llama.cpp-master/examples/server/tests/features/results.feature 4.24KB
llama.cpp-master/examples/server/tests/features/security.feature 2.48KB
llama.cpp-master/examples/server/tests/features/server.feature 4.94KB
llama.cpp-master/examples/server/tests/features/slotsave.feature 2.41KB
llama.cpp-master/examples/server/tests/features/steps/
llama.cpp-master/examples/server/tests/features/steps/steps.py 54.7KB
llama.cpp-master/examples/server/tests/features/wrong_usages.feature 794B
llama.cpp-master/examples/server/tests/requirements.txt 125B
llama.cpp-master/examples/server/tests/tests.sh 197B
llama.cpp-master/examples/server/themes/
llama.cpp-master/examples/server/themes/README.md 182B
llama.cpp-master/examples/server/themes/buttons-top/
llama.cpp-master/examples/server/themes/buttons-top/README.md 260B
llama.cpp-master/examples/server/themes/buttons-top/buttons_top.png 116.94KB
llama.cpp-master/examples/server/themes/buttons-top/favicon.ico 4.03KB
llama.cpp-master/examples/server/themes/buttons-top/index.html 33.74KB
llama.cpp-master/examples/server/themes/wild/
llama.cpp-master/examples/server/themes/wild/README.md 127B
llama.cpp-master/examples/server/themes/wild/favicon.ico 4.03KB
llama.cpp-master/examples/server/themes/wild/index.html 33.86KB
llama.cpp-master/examples/server/themes/wild/llama_cpp.png 74.69KB
llama.cpp-master/examples/server/themes/wild/llamapattern.png 253.5KB
llama.cpp-master/examples/server/themes/wild/wild.png 484.83KB
llama.cpp-master/examples/server/utils.hpp 21.02KB
llama.cpp-master/examples/server_embd.py 971B
llama.cpp-master/examples/simple/
llama.cpp-master/examples/simple/CMakeLists.txt 231B
llama.cpp-master/examples/simple/README.md 915B
llama.cpp-master/examples/simple/simple.cpp 4.87KB
llama.cpp-master/examples/speculative/
llama.cpp-master/examples/speculative/CMakeLists.txt 241B
llama.cpp-master/examples/speculative/README.md 285B
llama.cpp-master/examples/speculative/speculative.cpp 23.86KB
llama.cpp-master/examples/sycl/
llama.cpp-master/examples/sycl/CMakeLists.txt 335B
llama.cpp-master/examples/sycl/README.md 1.43KB
llama.cpp-master/examples/sycl/build.sh 582B
llama.cpp-master/examples/sycl/ls-sycl-device.cpp 195B
llama.cpp-master/examples/sycl/run-llama2.sh 1.23KB
llama.cpp-master/examples/sycl/win-build-sycl.bat 845B
llama.cpp-master/examples/sycl/win-run-llama2.bat 330B
llama.cpp-master/examples/tokenize/
llama.cpp-master/examples/tokenize/CMakeLists.txt 235B
llama.cpp-master/examples/tokenize/tokenize.cpp 13.4KB
llama.cpp-master/examples/ts-type-to-grammar.sh 920B
llama.cpp-master/flake.lock 1.52KB
llama.cpp-master/flake.nix 7.18KB
llama.cpp-master/ggml/
llama.cpp-master/ggml/.gitignore 56B
llama.cpp-master/ggml/CMakeLists.txt 9.36KB
llama.cpp-master/ggml/cmake/
llama.cpp-master/ggml/cmake/FindSIMD.cmake 2.59KB
llama.cpp-master/ggml/include/
llama.cpp-master/ggml/include/ggml-alloc.h 2.92KB
llama.cpp-master/ggml/include/ggml-backend.h 13.38KB
llama.cpp-master/ggml/include/ggml-blas.h 526B
llama.cpp-master/ggml/include/ggml-cann.h 4.55KB
llama.cpp-master/ggml/include/ggml-cuda.h 1.59KB
llama.cpp-master/ggml/include/ggml-kompute.h 1KB
llama.cpp-master/ggml/include/ggml-metal.h 2.25KB
llama.cpp-master/ggml/include/ggml-rpc.h 673B
llama.cpp-master/ggml/include/ggml-sycl.h 1.46KB
llama.cpp-master/ggml/include/ggml-vulkan.h 946B
llama.cpp-master/ggml/include/ggml.h 89.94KB
llama.cpp-master/ggml/src/
llama.cpp-master/ggml/src/CMakeLists.txt 50.03KB
llama.cpp-master/ggml/src/ggml-aarch64.c 91.2KB
llama.cpp-master/ggml/src/ggml-aarch64.h 1.96KB
llama.cpp-master/ggml/src/ggml-alloc.c 37.49KB
llama.cpp-master/ggml/src/ggml-backend-impl.h 7.43KB
llama.cpp-master/ggml/src/ggml-backend.c 83.33KB
llama.cpp-master/ggml/src/ggml-blas.cpp 12.15KB
llama.cpp-master/ggml/src/ggml-cann.cpp 69.96KB
llama.cpp-master/ggml/src/ggml-cann/
llama.cpp-master/ggml/src/ggml-cann/.clang-format 4.42KB
llama.cpp-master/ggml/src/ggml-cann/Doxyfile 109.97KB
llama.cpp-master/ggml/src/ggml-cann/acl_tensor.cpp 6.92KB
llama.cpp-master/ggml/src/ggml-cann/acl_tensor.h 12.17KB
llama.cpp-master/ggml/src/ggml-cann/aclnn_ops.cpp 122.73KB
llama.cpp-master/ggml/src/ggml-cann/aclnn_ops.h 25.25KB
llama.cpp-master/ggml/src/ggml-cann/common.h 9.25KB
llama.cpp-master/ggml/src/ggml-cann/kernels/
llama.cpp-master/ggml/src/ggml-cann/kernels/CMakeLists.txt 1.03KB
llama.cpp-master/ggml/src/ggml-cann/kernels/ascendc_kernels.h 693B
llama.cpp-master/ggml/src/ggml-cann/kernels/dup.cpp 7.99KB
llama.cpp-master/ggml/src/ggml-cann/kernels/get_row_f16.cpp 6.79KB
llama.cpp-master/ggml/src/ggml-cann/kernels/get_row_f32.cpp 6.53KB
llama.cpp-master/ggml/src/ggml-cann/kernels/get_row_q4_0.cpp 7.03KB
llama.cpp-master/ggml/src/ggml-cann/kernels/get_row_q8_0.cpp 6.96KB
llama.cpp-master/ggml/src/ggml-cann/kernels/quantize_f16_q8_0.cpp 7.37KB
llama.cpp-master/ggml/src/ggml-cann/kernels/quantize_f32_q8_0.cpp 7.31KB
llama.cpp-master/ggml/src/ggml-cann/kernels/quantize_float_to_q4_0.cpp 10.67KB
llama.cpp-master/ggml/src/ggml-common.h 129.8KB
llama.cpp-master/ggml/src/ggml-cuda.cu 120.79KB
llama.cpp-master/ggml/src/ggml-cuda/
llama.cpp-master/ggml/src/ggml-cuda/acc.cu 1.93KB
llama.cpp-master/ggml/src/ggml-cuda/acc.cuh 131B
llama.cpp-master/ggml/src/ggml-cuda/arange.cu 1.19KB
llama.cpp-master/ggml/src/ggml-cuda/arange.cuh 137B
llama.cpp-master/ggml/src/ggml-cuda/argsort.cu 3.35KB
llama.cpp-master/ggml/src/ggml-cuda/argsort.cuh 102B
llama.cpp-master/ggml/src/ggml-cuda/binbcast.cu 10.29KB
llama.cpp-master/ggml/src/ggml-cuda/binbcast.cuh 326B
llama.cpp-master/ggml/src/ggml-cuda/clamp.cu 1.14KB
llama.cpp-master/ggml/src/ggml-cuda/clamp.cuh 135B
llama.cpp-master/ggml/src/ggml-cuda/common.cuh 20.76KB
llama.cpp-master/ggml/src/ggml-cuda/concat.cu 6.35KB
llama.cpp-master/ggml/src/ggml-cuda/concat.cuh 137B
llama.cpp-master/ggml/src/ggml-cuda/conv-transpose-1d.cu 3.25KB
llama.cpp-master/ggml/src/ggml-cuda/conv-transpose-1d.cuh 158B
llama.cpp-master/ggml/src/ggml-cuda/convert.cu 25.03KB
llama.cpp-master/ggml/src/ggml-cuda/convert.cuh 391B
llama.cpp-master/ggml/src/ggml-cuda/cpy.cu 19.95KB
llama.cpp-master/ggml/src/ggml-cuda/cpy.cuh 298B
llama.cpp-master/ggml/src/ggml-cuda/dequantize.cuh 2.59KB
llama.cpp-master/ggml/src/ggml-cuda/diagmask.cu 1.72KB
llama.cpp-master/ggml/src/ggml-cuda/diagmask.cuh 150B
llama.cpp-master/ggml/src/ggml-cuda/dmmv.cu 27.48KB
llama.cpp-master/ggml/src/ggml-cuda/dmmv.cuh 642B
llama.cpp-master/ggml/src/ggml-cuda/fattn-common.cuh 23.67KB
llama.cpp-master/ggml/src/ggml-cuda/fattn-tile-f16.cu 11.13KB
llama.cpp-master/ggml/src/ggml-cuda/fattn-tile-f16.cuh 115B
llama.cpp-master/ggml/src/ggml-cuda/fattn-tile-f32.cu 11.05KB
llama.cpp-master/ggml/src/ggml-cuda/fattn-tile-f32.cuh 115B
llama.cpp-master/ggml/src/ggml-cuda/fattn-vec-f16.cuh 14.64KB
llama.cpp-master/ggml/src/ggml-cuda/fattn-vec-f32.cuh 13.71KB
llama.cpp-master/ggml/src/ggml-cuda/fattn-wmma-f16.cuh 20.11KB
llama.cpp-master/ggml/src/ggml-cuda/fattn.cu 13.84KB
llama.cpp-master/ggml/src/ggml-cuda/fattn.cuh 106B
llama.cpp-master/ggml/src/ggml-cuda/getrows.cu 6.83KB
llama.cpp-master/ggml/src/ggml-cuda/getrows.cuh 141B
llama.cpp-master/ggml/src/ggml-cuda/im2col.cu 4.45KB
llama.cpp-master/ggml/src/ggml-cuda/im2col.cuh 137B
llama.cpp-master/ggml/src/ggml-cuda/mma.cuh 7.41KB
llama.cpp-master/ggml/src/ggml-cuda/mmq.cu 4.58KB
llama.cpp-master/ggml/src/ggml-cuda/mmq.cuh 110.48KB
llama.cpp-master/ggml/src/ggml-cuda/mmvq.cu 18.78KB
llama.cpp-master/ggml/src/ggml-cuda/mmvq.cuh 481B
llama.cpp-master/ggml/src/ggml-cuda/norm.cu 7.03KB
llama.cpp-master/ggml/src/ggml-cuda/norm.cuh 263B
llama.cpp-master/ggml/src/ggml-cuda/pad.cu 1.75KB
llama.cpp-master/ggml/src/ggml-cuda/pad.cuh 131B
llama.cpp-master/ggml/src/ggml-cuda/pool2d.cu 3.23KB
llama.cpp-master/ggml/src/ggml-cuda/pool2d.cuh 137B
llama.cpp-master/ggml/src/ggml-cuda/quantize.cu 5.34KB
llama.cpp-master/ggml/src/ggml-cuda/quantize.cuh 979B
llama.cpp-master/ggml/src/ggml-cuda/rope.cu 10.46KB
llama.cpp-master/ggml/src/ggml-cuda/rope.cuh 133B
llama.cpp-master/ggml/src/ggml-cuda/scale.cu 1021B
llama.cpp-master/ggml/src/ggml-cuda/scale.cuh 135B
llama.cpp-master/ggml/src/ggml-cuda/softmax.cu 7.54KB
llama.cpp-master/ggml/src/ggml-cuda/softmax.cuh 142B
llama.cpp-master/ggml/src/ggml-cuda/sumrows.cu 1.17KB
llama.cpp-master/ggml/src/ggml-cuda/sumrows.cuh 103B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu 177B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-f16-q4_0.cu 178B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-f16-q4_1.cu 178B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-f16-q5_0.cu 178B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-f16-q5_1.cu 178B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-f16-q8_0.cu 178B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_0-f16.cu 178B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_1.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_0-q5_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_0-q5_1.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_0-q8_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_1-f16.cu 178B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_1-q4_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_1-q4_1.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_1-q5_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_1-q5_1.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_1-q8_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q5_0-f16.cu 178B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q5_0-q4_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q5_0-q4_1.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q5_0-q5_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q5_0-q5_1.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q5_0-q8_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q5_1-f16.cu 178B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q5_1-q4_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q5_1-q4_1.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q5_1-q5_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q5_1-q5_1.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q5_1-q8_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q8_0-f16.cu 178B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q8_0-q4_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q8_0-q4_1.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q8_0-q5_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q8_0-q5_1.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu 177B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu 176B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs64-f16-q4_0.cu 177B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs64-f16-q4_1.cu 177B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs64-f16-q5_0.cu 177B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs64-f16-q5_1.cu 177B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs64-f16-q8_0.cu 177B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu 177B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-f16-q4_0.cu 178B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-f16-q4_1.cu 178B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-f16-q5_0.cu 178B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-f16-q5_1.cu 178B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-f16-q8_0.cu 178B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_0-f16.cu 178B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_1.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_0-q5_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_0-q5_1.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_0-q8_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_1-f16.cu 178B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_1-q4_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_1-q4_1.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_1-q5_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_1-q5_1.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_1-q8_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q5_0-f16.cu 178B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q5_0-q4_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q5_0-q4_1.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q5_0-q5_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q5_0-q5_1.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q5_0-q8_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q5_1-f16.cu 178B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q5_1-q4_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q5_1-q4_1.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q5_1-q5_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q5_1-q5_1.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q5_1-q8_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q8_0-f16.cu 178B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q8_0-q4_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q8_0-q4_1.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q8_0-q5_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q8_0-q5_1.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu 179B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu 177B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu 176B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs64-f16-q4_0.cu 177B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs64-f16-q4_1.cu 177B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs64-f16-q5_0.cu 177B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs64-f16-q5_1.cu 177B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs64-f16-q8_0.cu 177B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-wmma-f16-instance-kqfloat-cpb16.cu 367B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-wmma-f16-instance-kqfloat-cpb32.cu 325B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-wmma-f16-instance-kqhalf-cpb16.cu 361B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-wmma-f16-instance-kqhalf-cpb32.cu 361B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/fattn-wmma-f16-instance-kqhalf-cpb8.cu 276B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/generate_cu_files.py 2.76KB
llama.cpp-master/ggml/src/ggml-cuda/template-instances/mmq-instance-iq1_s.cu 139B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_s.cu 139B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xs.cu 140B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xxs.cu 141B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_s.cu 139B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_xxs.cu 141B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/mmq-instance-iq4_nl.cu 140B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/mmq-instance-iq4_xs.cu 140B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/mmq-instance-q2_k.cu 138B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/mmq-instance-q3_k.cu 138B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_0.cu 138B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_1.cu 138B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_k.cu 138B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_0.cu 138B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_1.cu 138B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_k.cu 138B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/mmq-instance-q6_k.cu 138B
llama.cpp-master/ggml/src/ggml-cuda/template-instances/mmq-instance-q8_0.cu 138B
llama.cpp-master/ggml/src/ggml-cuda/tsembd.cu 1.76KB
llama.cpp-master/ggml/src/ggml-cuda/tsembd.cuh 161B
llama.cpp-master/ggml/src/ggml-cuda/unary.cu 10.58KB
llama.cpp-master/ggml/src/ggml-cuda/unary.cuh 1.18KB
llama.cpp-master/ggml/src/ggml-cuda/upscale.cu 2.07KB
llama.cpp-master/ggml/src/ggml-cuda/upscale.cuh 139B
llama.cpp-master/ggml/src/ggml-cuda/vecdotq.cuh 38.2KB
llama.cpp-master/ggml/src/ggml-cuda/vendors/
llama.cpp-master/ggml/src/ggml-cuda/vendors/cuda.h 462B
llama.cpp-master/ggml/src/ggml-cuda/vendors/hip.h 7.1KB
llama.cpp-master/ggml/src/ggml-cuda/vendors/musa.h 7.04KB
llama.cpp-master/ggml/src/ggml-impl.h 20.06KB
llama.cpp-master/ggml/src/ggml-kompute.cpp 79.01KB
llama.cpp-master/ggml/src/ggml-metal.m 187.1KB
llama.cpp-master/ggml/src/ggml-metal.metal 221.39KB
llama.cpp-master/ggml/src/ggml-quants.c 637.5KB
llama.cpp-master/ggml/src/ggml-quants.h 11.48KB
llama.cpp-master/ggml/src/ggml-rpc.cpp 43.35KB
llama.cpp-master/ggml/src/ggml-sycl.cpp 209.44KB
llama.cpp-master/ggml/src/ggml-sycl/
llama.cpp-master/ggml/src/ggml-sycl/backend.hpp 653B
llama.cpp-master/ggml/src/ggml-sycl/common.cpp 1.49KB
llama.cpp-master/ggml/src/ggml-sycl/common.hpp 10.68KB
llama.cpp-master/ggml/src/ggml-sycl/concat.cpp 7.39KB
llama.cpp-master/ggml/src/ggml-sycl/concat.hpp 575B
llama.cpp-master/ggml/src/ggml-sycl/conv.cpp 3.16KB
llama.cpp-master/ggml/src/ggml-sycl/conv.hpp 550B
llama.cpp-master/ggml/src/ggml-sycl/convert.cpp 21.31KB
llama.cpp-master/ggml/src/ggml-sycl/convert.hpp 778B
llama.cpp-master/ggml/src/ggml-sycl/dequantize.hpp 22.99KB
llama.cpp-master/ggml/src/ggml-sycl/dmmv.cpp 40.63KB
llama.cpp-master/ggml/src/ggml-sycl/dmmv.hpp 808B
llama.cpp-master/ggml/src/ggml-sycl/dpct/
llama.cpp-master/ggml/src/ggml-sycl/dpct/helper.hpp 120.39KB
llama.cpp-master/ggml/src/ggml-sycl/mmq.cpp 116.7KB
llama.cpp-master/ggml/src/ggml-sycl/mmq.hpp 819B
llama.cpp-master/ggml/src/ggml-sycl/mmvq.cpp 38.85KB
llama.cpp-master/ggml/src/ggml-sycl/mmvq.hpp 799B
llama.cpp-master/ggml/src/ggml-sycl/norm.cpp 13.23KB
llama.cpp-master/ggml/src/ggml-sycl/norm.hpp 1.08KB
llama.cpp-master/ggml/src/ggml-sycl/presets.hpp 1.97KB
llama.cpp-master/ggml/src/ggml-sycl/rope.cpp 10.38KB
llama.cpp-master/ggml/src/ggml-sycl/rope.hpp 633B
llama.cpp-master/ggml/src/ggml-sycl/softmax.cpp 10.88KB
llama.cpp-master/ggml/src/ggml-sycl/softmax.hpp 652B
llama.cpp-master/ggml/src/ggml-sycl/tsembd.cpp 2.53KB
llama.cpp-master/ggml/src/ggml-sycl/tsembd.hpp 560B
llama.cpp-master/ggml/src/ggml-sycl/vecdotq.hpp 38.82KB
llama.cpp-master/ggml/src/ggml-vulkan.cpp 405.37KB
llama.cpp-master/ggml/src/ggml.c 713.38KB
llama.cpp-master/ggml/src/kompute/
llama.cpp-master/ggml/src/kompute-shaders/
llama.cpp-master/ggml/src/kompute-shaders/common.comp 3.53KB
llama.cpp-master/ggml/src/kompute-shaders/op_add.comp 1.61KB
llama.cpp-master/ggml/src/kompute-shaders/op_addrow.comp 640B
llama.cpp-master/ggml/src/kompute-shaders/op_cpy_f16_f16.comp 1.5KB
llama.cpp-master/ggml/src/kompute-shaders/op_cpy_f16_f32.comp 1.49KB
llama.cpp-master/ggml/src/kompute-shaders/op_cpy_f32_f16.comp 1.49KB
llama.cpp-master/ggml/src/kompute-shaders/op_cpy_f32_f32.comp 1.49KB
llama.cpp-master/ggml/src/kompute-shaders/op_diagmask.comp 726B
llama.cpp-master/ggml/src/kompute-shaders/op_gelu.comp 604B
llama.cpp-master/ggml/src/kompute-shaders/op_getrows.comp 609B
llama.cpp-master/ggml/src/kompute-shaders/op_getrows_f16.comp 787B
llama.cpp-master/ggml/src/kompute-shaders/op_getrows_f32.comp 762B
llama.cpp-master/ggml/src/kompute-shaders/op_getrows_q4_0.comp 919B
llama.cpp-master/ggml/src/kompute-shaders/op_getrows_q4_1.comp 962B
llama.cpp-master/ggml/src/kompute-shaders/op_getrows_q6_k.comp 1.16KB
llama.cpp-master/ggml/src/kompute-shaders/op_mul.comp 1.33KB
llama.cpp-master/ggml/src/kompute-shaders/op_mul_mat_f16.comp 1.59KB
llama.cpp-master/ggml/src/kompute-shaders/op_mul_mat_mat_f32.comp 1.27KB
llama.cpp-master/ggml/src/kompute-shaders/op_mul_mat_q4_0.comp 1018B
llama.cpp-master/ggml/src/kompute-shaders/op_mul_mat_q4_1.comp 1.04KB
llama.cpp-master/ggml/src/kompute-shaders/op_mul_mat_q6_k.comp 3.54KB
llama.cpp-master/ggml/src/kompute-shaders/op_mul_mat_q8_0.comp 2.19KB
llama.cpp-master/ggml/src/kompute-shaders/op_mul_mv_q_n.comp 1.76KB
llama.cpp-master/ggml/src/kompute-shaders/op_mul_mv_q_n_pre.comp 521B
llama.cpp-master/ggml/src/kompute-shaders/op_norm.comp 2.25KB
llama.cpp-master/ggml/src/kompute-shaders/op_relu.comp 508B
llama.cpp-master/ggml/src/kompute-shaders/op_rmsnorm.comp 1.38KB
llama.cpp-master/ggml/src/kompute-shaders/op_rope_f16.comp 2.84KB
llama.cpp-master/ggml/src/kompute-shaders/op_rope_f32.comp 2.74KB
llama.cpp-master/ggml/src/kompute-shaders/op_scale.comp 432B
llama.cpp-master/ggml/src/kompute-shaders/op_scale_8.comp 528B
llama.cpp-master/ggml/src/kompute-shaders/op_silu.comp 543B
llama.cpp-master/ggml/src/kompute-shaders/op_softmax.comp 1.75KB
llama.cpp-master/ggml/src/kompute-shaders/rope_common.comp 2.25KB
llama.cpp-master/ggml/src/llamafile/
llama.cpp-master/ggml/src/llamafile/sgemm.cpp 30.9KB
llama.cpp-master/ggml/src/llamafile/sgemm.h 302B
llama.cpp-master/ggml/src/vulkan-shaders/
llama.cpp-master/ggml/src/vulkan-shaders/CMakeLists.txt 268B
llama.cpp-master/ggml/src/vulkan-shaders/add.comp 287B
llama.cpp-master/ggml/src/vulkan-shaders/argsort.comp 1.96KB
llama.cpp-master/ggml/src/vulkan-shaders/clamp.comp 340B
llama.cpp-master/ggml/src/vulkan-shaders/concat.comp 1.25KB
llama.cpp-master/ggml/src/vulkan-shaders/copy.comp 352B
llama.cpp-master/ggml/src/vulkan-shaders/dequant_f32.comp 442B
llama.cpp-master/ggml/src/vulkan-shaders/dequant_funcs.comp 2.33KB
llama.cpp-master/ggml/src/vulkan-shaders/dequant_head.comp 249B
llama.cpp-master/ggml/src/vulkan-shaders/dequant_iq4_nl.comp 871B
llama.cpp-master/ggml/src/vulkan-shaders/dequant_q2_k.comp 1.44KB
llama.cpp-master/ggml/src/vulkan-shaders/dequant_q3_k.comp 1.68KB
llama.cpp-master/ggml/src/vulkan-shaders/dequant_q4_0.comp 861B
llama.cpp-master/ggml/src/vulkan-shaders/dequant_q4_1.comp 892B
llama.cpp-master/ggml/src/vulkan-shaders/dequant_q4_k.comp 1.97KB
llama.cpp-master/ggml/src/vulkan-shaders/dequant_q5_0.comp 1.02KB
llama.cpp-master/ggml/src/vulkan-shaders/dequant_q5_1.comp 1.02KB
llama.cpp-master/ggml/src/vulkan-shaders/dequant_q5_k.comp 2.41KB
llama.cpp-master/ggml/src/vulkan-shaders/dequant_q6_k.comp 1.39KB
llama.cpp-master/ggml/src/vulkan-shaders/dequant_q8_0.comp 839B
llama.cpp-master/ggml/src/vulkan-shaders/diag_mask_inf.comp 799B
llama.cpp-master/ggml/src/vulkan-shaders/div.comp 287B
llama.cpp-master/ggml/src/vulkan-shaders/gelu.comp 767B
llama.cpp-master/ggml/src/vulkan-shaders/gelu_quick.comp 631B
llama.cpp-master/ggml/src/vulkan-shaders/generic_binary_head.comp 2.11KB
llama.cpp-master/ggml/src/vulkan-shaders/generic_head.comp 160B
llama.cpp-master/ggml/src/vulkan-shaders/generic_unary_head.comp 1.48KB
llama.cpp-master/ggml/src/vulkan-shaders/get_rows.comp 702B
llama.cpp-master/ggml/src/vulkan-shaders/get_rows_quant.comp 970B
llama.cpp-master/ggml/src/vulkan-shaders/group_norm.comp 1.68KB
llama.cpp-master/ggml/src/vulkan-shaders/im2col.comp 1.49KB
llama.cpp-master/ggml/src/vulkan-shaders/leaky_relu.comp 586B
llama.cpp-master/ggml/src/vulkan-shaders/mul.comp 287B
llama.cpp-master/ggml/src/vulkan-shaders/mul_mat_split_k_reduce.comp 592B
llama.cpp-master/ggml/src/vulkan-shaders/mul_mat_vec.comp 1.62KB
llama.cpp-master/ggml/src/vulkan-shaders/mul_mat_vec_base.comp 1.75KB
llama.cpp-master/ggml/src/vulkan-shaders/mul_mat_vec_nc.comp 1.76KB
llama.cpp-master/ggml/src/vulkan-shaders/mul_mat_vec_p021.comp 1.86KB
llama.cpp-master/ggml/src/vulkan-shaders/mul_mat_vec_q2_k.comp 4.33KB
llama.cpp-master/ggml/src/vulkan-shaders/mul_mat_vec_q3_k.comp 4.36KB
llama.cpp-master/ggml/src/vulkan-shaders/mul_mat_vec_q4_k.comp 8.02KB
llama.cpp-master/ggml/src/vulkan-shaders/mul_mat_vec_q5_k.comp 6.92KB
llama.cpp-master/ggml/src/vulkan-shaders/mul_mat_vec_q6_k.comp 4.91KB
llama.cpp-master/ggml/src/vulkan-shaders/mul_mm.comp 21.02KB
llama.cpp-master/ggml/src/vulkan-shaders/norm.comp 1.25KB
llama.cpp-master/ggml/src/vulkan-shaders/pad.comp 884B
llama.cpp-master/ggml/src/vulkan-shaders/relu.comp 520B
llama.cpp-master/ggml/src/vulkan-shaders/rms_norm.comp 1.25KB
llama.cpp-master/ggml/src/vulkan-shaders/rope_head.comp 1.47KB
llama.cpp-master/ggml/src/vulkan-shaders/rope_neox.comp 929B
llama.cpp-master/ggml/src/vulkan-shaders/rope_norm.comp 901B
llama.cpp-master/ggml/src/vulkan-shaders/scale.comp 273B
llama.cpp-master/ggml/src/vulkan-shaders/silu.comp 565B
llama.cpp-master/ggml/src/vulkan-shaders/soft_max.comp 2.61KB
llama.cpp-master/ggml/src/vulkan-shaders/square.comp 288B
llama.cpp-master/ggml/src/vulkan-shaders/sum_rows.comp 940B
llama.cpp-master/ggml/src/vulkan-shaders/tanh.comp 519B
llama.cpp-master/ggml/src/vulkan-shaders/timestep_embedding.comp 1KB
llama.cpp-master/ggml/src/vulkan-shaders/types.comp 3.44KB
llama.cpp-master/ggml/src/vulkan-shaders/upscale.comp 1.07KB
llama.cpp-master/ggml/src/vulkan-shaders/vulkan-shaders-gen.cpp 22.18KB
llama.cpp-master/gguf-py/
llama.cpp-master/gguf-py/LICENSE 1.05KB
llama.cpp-master/gguf-py/README.md 2.66KB
llama.cpp-master/gguf-py/examples/
llama.cpp-master/gguf-py/examples/reader.py 1.54KB
llama.cpp-master/gguf-py/examples/writer.py 1.09KB
llama.cpp-master/gguf-py/gguf/
llama.cpp-master/gguf-py/gguf/__init__.py 219B
llama.cpp-master/gguf-py/gguf/constants.py 45.91KB
llama.cpp-master/gguf-py/gguf/gguf.py 478B
llama.cpp-master/gguf-py/gguf/gguf_reader.py 12.08KB
llama.cpp-master/gguf-py/gguf/gguf_writer.py 34.14KB
llama.cpp-master/gguf-py/gguf/lazy.py 8.33KB
llama.cpp-master/gguf-py/gguf/metadata.py 25.12KB
llama.cpp-master/gguf-py/gguf/py.typed
llama.cpp-master/gguf-py/gguf/quants.py 4.21KB
llama.cpp-master/gguf-py/gguf/tensor_mapping.py 29.44KB
llama.cpp-master/gguf-py/gguf/utility.py 2.87KB
llama.cpp-master/gguf-py/gguf/vocab.py 18.6KB
llama.cpp-master/gguf-py/pyproject.toml 1013B
llama.cpp-master/gguf-py/scripts/
llama.cpp-master/gguf-py/scripts/__init__.py 297B
llama.cpp-master/gguf-py/scripts/gguf_convert_endian.py 5.16KB
llama.cpp-master/gguf-py/scripts/gguf_dump.py 21.42KB
llama.cpp-master/gguf-py/scripts/gguf_hash.py 3.62KB
llama.cpp-master/gguf-py/scripts/gguf_new_metadata.py 10.46KB
llama.cpp-master/gguf-py/scripts/gguf_set_metadata.py 4.03KB
llama.cpp-master/gguf-py/tests/
llama.cpp-master/gguf-py/tests/__init__.py 29B
llama.cpp-master/gguf-py/tests/test_metadata.py 12.46KB
llama.cpp-master/grammars/
llama.cpp-master/grammars/README.md 16.44KB
llama.cpp-master/grammars/arithmetic.gbnf 177B
llama.cpp-master/grammars/c.gbnf 1.35KB
llama.cpp-master/grammars/chess.gbnf 565B
llama.cpp-master/grammars/japanese.gbnf 249B
llama.cpp-master/grammars/json.gbnf 601B
llama.cpp-master/grammars/json_arr.gbnf 796B
llama.cpp-master/grammars/list.gbnf 109B
llama.cpp-master/include/
llama.cpp-master/include/llama.h 56.21KB
llama.cpp-master/media/
llama.cpp-master/media/llama-leader.jpeg 195.26KB
llama.cpp-master/media/llama0-banner.png 141.23KB
llama.cpp-master/media/llama0-logo.png 175.72KB
llama.cpp-master/media/llama1-banner.png 32.55KB
llama.cpp-master/media/llama1-logo.png 31.73KB
llama.cpp-master/media/matmul.png 259.48KB
llama.cpp-master/media/matmul.svg 51.38KB
llama.cpp-master/models/
llama.cpp-master/models/.editorconfig 12B
llama.cpp-master/models/ggml-vocab-aquila.gguf 4.6MB
llama.cpp-master/models/ggml-vocab-baichuan.gguf 1.28MB
llama.cpp-master/models/ggml-vocab-bert-bge.gguf 612.84KB
llama.cpp-master/models/ggml-vocab-bert-bge.gguf.inp 1.9KB
llama.cpp-master/models/ggml-vocab-bert-bge.gguf.out 1.59KB
llama.cpp-master/models/ggml-vocab-command-r.gguf 10.37MB
llama.cpp-master/models/ggml-vocab-command-r.gguf.inp 1.9KB
llama.cpp-master/models/ggml-vocab-command-r.gguf.out 1.86KB
llama.cpp-master/models/ggml-vocab-deepseek-coder.gguf 1.1MB
llama.cpp-master/models/ggml-vocab-deepseek-coder.gguf.inp 1.9KB
llama.cpp-master/models/ggml-vocab-deepseek-coder.gguf.out 2.05KB
llama.cpp-master/models/ggml-vocab-deepseek-llm.gguf 3.79MB
llama.cpp-master/models/ggml-vocab-deepseek-llm.gguf.inp 1.9KB
llama.cpp-master/models/ggml-vocab-deepseek-llm.gguf.out 1.88KB
llama.cpp-master/models/ggml-vocab-falcon.gguf 2.18MB
llama.cpp-master/models/ggml-vocab-falcon.gguf.inp 1.9KB
llama.cpp-master/models/ggml-vocab-falcon.gguf.out 1.96KB
llama.cpp-master/models/ggml-vocab-gpt-2.gguf 1.68MB
llama.cpp-master/models/ggml-vocab-gpt-2.gguf.inp 1.9KB
llama.cpp-master/models/ggml-vocab-gpt-2.gguf.out 2.1KB
llama.cpp-master/models/ggml-vocab-gpt-neox.gguf 1.69MB
llama.cpp-master/models/ggml-vocab-llama-bpe.gguf 7.46MB
llama.cpp-master/models/ggml-vocab-llama-bpe.gguf.inp 1.9KB
llama.cpp-master/models/ggml-vocab-llama-bpe.gguf.out 1.7KB
llama.cpp-master/models/ggml-vocab-llama-spm.gguf 706.9KB
llama.cpp-master/models/ggml-vocab-llama-spm.gguf.inp 1.9KB
llama.cpp-master/models/ggml-vocab-llama-spm.gguf.out 2.62KB
llama.cpp-master/models/ggml-vocab-mpt.gguf 1.69MB
llama.cpp-master/models/ggml-vocab-mpt.gguf.inp 1.9KB
llama.cpp-master/models/ggml-vocab-mpt.gguf.out 1.85KB
llama.cpp-master/models/ggml-vocab-phi-3.gguf 709KB
llama.cpp-master/models/ggml-vocab-phi-3.gguf.inp 1.9KB
llama.cpp-master/models/ggml-vocab-phi-3.gguf.out 2.62KB
llama.cpp-master/models/ggml-vocab-qwen2.gguf 5.65MB
llama.cpp-master/models/ggml-vocab-qwen2.gguf.inp 1.9KB
llama.cpp-master/models/ggml-vocab-qwen2.gguf.out 1.72KB
llama.cpp-master/models/ggml-vocab-refact.gguf 1.64MB
llama.cpp-master/models/ggml-vocab-refact.gguf.inp 1.9KB
llama.cpp-master/models/ggml-vocab-refact.gguf.out 1.87KB
llama.cpp-master/models/ggml-vocab-starcoder.gguf 1.64MB
llama.cpp-master/models/ggml-vocab-starcoder.gguf.inp 1.9KB
llama.cpp-master/models/ggml-vocab-starcoder.gguf.out 1.88KB
llama.cpp-master/mypy.ini 163B
llama.cpp-master/pocs/
llama.cpp-master/pocs/CMakeLists.txt 171B
llama.cpp-master/pocs/vdot/
llama.cpp-master/pocs/vdot/CMakeLists.txt 387B
llama.cpp-master/pocs/vdot/q8dot.cpp 5.23KB
llama.cpp-master/pocs/vdot/vdot.cpp 13.18KB
llama.cpp-master/poetry.lock 121.86KB
llama.cpp-master/prompts/
llama.cpp-master/prompts/LLM-questions.txt 2.54KB
llama.cpp-master/prompts/alpaca.txt 106B
llama.cpp-master/prompts/assistant.txt 2.29KB
llama.cpp-master/prompts/chat-with-baichuan.txt 90B
llama.cpp-master/prompts/chat-with-bob.txt 386B
llama.cpp-master/prompts/chat-with-qwen.txt 28B
llama.cpp-master/prompts/chat-with-vicuna-v0.txt 446B
llama.cpp-master/prompts/chat-with-vicuna-v1.txt 426B
llama.cpp-master/prompts/chat.txt 1.79KB
llama.cpp-master/prompts/dan-modified.txt 1.5KB
llama.cpp-master/prompts/dan.txt 1.62KB
llama.cpp-master/prompts/mnemonics.txt 4.97KB
llama.cpp-master/prompts/parallel-questions.txt 1.68KB
llama.cpp-master/prompts/reason-act.txt 758B
llama.cpp-master/pyproject.toml 1.25KB
llama.cpp-master/pyrightconfig.json 528B
llama.cpp-master/requirements.txt 505B
llama.cpp-master/requirements/
llama.cpp-master/requirements/requirements-all.txt 428B
llama.cpp-master/requirements/requirements-compare-llama-bench.txt 34B
llama.cpp-master/requirements/requirements-convert_hf_to_gguf.txt 111B
llama.cpp-master/requirements/requirements-convert_hf_to_gguf_update.txt 111B
llama.cpp-master/requirements/requirements-convert_legacy_llama.txt 99B
llama.cpp-master/requirements/requirements-convert_llama_ggml_to_gguf.txt 43B
llama.cpp-master/requirements/requirements-convert_lora_to_gguf.txt 96B
llama.cpp-master/requirements/requirements-pydantic.txt 48B
llama.cpp-master/requirements/requirements-test-tokenizer-random.txt 13B
llama.cpp-master/scripts/
llama.cpp-master/scripts/build-info.sh 717B
llama.cpp-master/scripts/check-requirements.sh 4.34KB
llama.cpp-master/scripts/ci-run.sh 1.28KB
llama.cpp-master/scripts/compare-commits.sh 749B
llama.cpp-master/scripts/compare-llama-bench.py 14.3KB
llama.cpp-master/scripts/debug-test.sh 5.01KB
llama.cpp-master/scripts/gen-authors.sh 337B
llama.cpp-master/scripts/gen-unicode-data.py 6.28KB
llama.cpp-master/scripts/get-flags.mk 1.27KB
llama.cpp-master/scripts/get-hellaswag.sh 263B
llama.cpp-master/scripts/get-pg.sh 1.36KB
llama.cpp-master/scripts/get-wikitext-103.sh 210B
llama.cpp-master/scripts/get-wikitext-2.sh 253B
llama.cpp-master/scripts/get-winogrande.sh 292B
llama.cpp-master/scripts/hf.sh 2.26KB
llama.cpp-master/scripts/install-oneapi.bat 802B
llama.cpp-master/scripts/pod-llama.sh 8.17KB
llama.cpp-master/scripts/qnt-all.sh 558B
llama.cpp-master/scripts/run-all-perf.sh 549B
llama.cpp-master/scripts/run-all-ppl.sh 554B
llama.cpp-master/scripts/run-with-preset.py 5.47KB
llama.cpp-master/scripts/server-llm.sh 11.22KB
llama.cpp-master/scripts/sync-ggml-am.sh 7.89KB
llama.cpp-master/scripts/sync-ggml.last 41B
llama.cpp-master/scripts/sync-ggml.sh 2.58KB
llama.cpp-master/scripts/verify-checksum-models.py 2.42KB
llama.cpp-master/scripts/xxd.cmake 647B
llama.cpp-master/spm-headers/
llama.cpp-master/spm-headers/ggml-alloc.h 28B
llama.cpp-master/spm-headers/ggml-backend.h 30B
llama.cpp-master/spm-headers/ggml-metal.h 28B
llama.cpp-master/spm-headers/ggml.h 22B
llama.cpp-master/spm-headers/llama.h 18B
llama.cpp-master/src/
llama.cpp-master/src/CMakeLists.txt 749B
llama.cpp-master/src/llama-grammar.cpp 19.4KB
llama.cpp-master/src/llama-grammar.h 1.09KB
llama.cpp-master/src/llama-impl.h 795B
llama.cpp-master/src/llama-sampling.cpp 22.09KB
llama.cpp-master/src/llama-sampling.h 2.63KB
llama.cpp-master/src/llama-vocab.cpp 66.57KB
llama.cpp-master/src/llama-vocab.h 4.64KB
llama.cpp-master/src/llama.cpp 789.47KB
llama.cpp-master/src/unicode-data.cpp 164.26KB
llama.cpp-master/src/unicode-data.h 582B
llama.cpp-master/src/unicode.cpp 29.9KB
llama.cpp-master/src/unicode.h 2.14KB
llama.cpp-master/tests/
llama.cpp-master/tests/.gitignore 25B
llama.cpp-master/tests/CMakeLists.txt 7.07KB
llama.cpp-master/tests/get-model.cpp 594B
llama.cpp-master/tests/get-model.h 53B
llama.cpp-master/tests/run-json-schema-to-grammar.mjs 395B
llama.cpp-master/tests/test-autorelease.cpp 719B
llama.cpp-master/tests/test-backend-ops.cpp 91.17KB
llama.cpp-master/tests/test-c.c 96B
llama.cpp-master/tests/test-chat-template.cpp 19.85KB
llama.cpp-master/tests/test-double-float.cpp 1.79KB
llama.cpp-master/tests/test-grad0.cpp 52.65KB
llama.cpp-master/tests/test-grammar-integration.cpp 35.63KB
llama.cpp-master/tests/test-grammar-parser.cpp 16.37KB
llama.cpp-master/tests/test-json-schema-to-grammar.cpp 38.9KB
llama.cpp-master/tests/test-llama-grammar.cpp 11.04KB
llama.cpp-master/tests/test-model-load-cancel.cpp 763B
llama.cpp-master/tests/test-opt.cpp 5.04KB
llama.cpp-master/tests/test-quantize-fns.cpp 6.54KB
llama.cpp-master/tests/test-quantize-perf.cpp 13.69KB
llama.cpp-master/tests/test-rope.cpp 6.12KB
llama.cpp-master/tests/test-sampling.cpp 13.34KB
llama.cpp-master/tests/test-tokenizer-0.cpp 10.46KB
llama.cpp-master/tests/test-tokenizer-0.py 1.92KB
llama.cpp-master/tests/test-tokenizer-0.sh 921B
llama.cpp-master/tests/test-tokenizer-1-bpe.cpp 4.68KB
llama.cpp-master/tests/test-tokenizer-1-spm.cpp 3.48KB
llama.cpp-master/tests/test-tokenizer-random.py 21.46KB