Article List

Explore the latest news, discover interesting content, and dive deep into topics that interest you

Synthetic Data for LLM Training (ML, MLOps)

Training foundation models at scale is constrained by data. Whether working with text, code, images, or multimodal inputs, the public datasets are sat...

2 months ago Blog - nept…
32732 words 109 min
What are LLM Embeddings: All you Need to Know (ML, MLOps)

Embeddings are numerical representations of text. They are fundamental to the transformer architecture and, thus, all Large Language Models (LLMs). I... (see the sketch below)

2 months, 1 week ago Blog - nept…
36786 words 122 min
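
A minimal illustration, not drawn from the article: in a transformer, an embedding layer maps token ids to dense vectors, and comparing meaning becomes comparing vectors. The toy vocabulary and the PyTorch usage here are assumptions for the sketch.

```python
import torch
import torch.nn as nn

# Toy vocabulary; real models use learned tokenizers with tens of
# thousands of entries.
vocab = {"the": 0, "cat": 1, "dog": 2, "sat": 3}
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=8)

token_ids = torch.tensor([vocab["cat"], vocab["dog"]])
vectors = embedding(token_ids)  # shape: (2, 8), one vector per token

# Semantic comparison happens in vector space, e.g. cosine similarity.
sim = torch.cosine_similarity(vectors[0], vectors[1], dim=0)
print(vectors.shape, sim.item())
```
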
Detecting and Fixing ‘Dead Neurons’ in Foundation Models (ML, MLOps)

In neural networks, some neurons end up outputting near-zero activations across all inputs. These so-called “dead neurons” degrade model capacity beca... (see the sketch below)

2 months, 2 weeks ago Blog - nept…
30905 words 103 min
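
A minimal sketch, not from the article, of one common way to flag dead units: record a layer's activations with a forward hook and mark units that stay near zero across a probe batch. The tiny model, threshold, and dictionary key are illustrative assumptions.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
acts = {}

def record(module, inputs, output):
    # Store the ReLU output for later inspection.
    acts["relu"] = output.detach()

model[1].register_forward_hook(record)
model(torch.randn(256, 16))  # one probe batch

# A unit counts as "dead" here if it never activates above ~0 on the batch.
max_per_unit = acts["relu"].amax(dim=0)
dead = (max_per_unit < 1e-6).sum().item()
print(f"{dead} dead units out of {max_per_unit.numel()}")
```
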
Part 2: Instruction Fine-Tuning: Evaluation and Advanced Techniques for Efficient Training (ML, MLOps)

In the first part of this series, we covered the fundamentals of instruction fine-tuning (IFT). We discussed how training LLMs on prompt-response pair...

2 months, 3 weeks ago Blog - nept…
49624 words 165 min
How to Optimize LLM Inference (ML, MLOps)

Large Language Model (LLM) inference at scale is challenging: it involves transferring massive amounts of model parameters and data and performing c... (see the sketch below)

3 months ago Blog - nept…
46748 words 155 min
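
A standard LLM inference optimization (whether or not it is this article's exact focus) is key-value caching: during autoregressive decoding, past attention keys and values are stored so each new token needs only one new attention computation. A minimal single-head sketch, not from the article; the dimensions are illustrative.

```python
import torch

d = 64                      # head dimension (illustrative)
k_cache, v_cache = [], []   # grows by one entry per generated token

def attend(q, k_new, v_new):
    # Reuse all previously computed keys/values instead of
    # re-encoding the whole sequence at every decode step.
    k_cache.append(k_new)
    v_cache.append(v_new)
    K = torch.stack(k_cache)  # (t, d)
    V = torch.stack(v_cache)  # (t, d)
    weights = torch.softmax(q @ K.T / d ** 0.5, dim=-1)
    return weights @ V        # attention output for the newest token

for _ in range(5):            # five decode steps
    out = attend(torch.randn(d), torch.randn(d), torch.randn(d))
print(out.shape)
```
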
A Researcher’s Guide to LLM Grounding (ML, MLOps)

Large Language Models (LLMs) can be thought of as knowledge bases. During training, LLMs observe large amounts of text. Through this process, they enc...

3 months, 2 weeks ago Blog - nept…
22460 words 74 min
Part 1: Instruction Fine-Tuning: Fundamentals, Architecture Modifications, and Loss Functions (ML, MLOps)

Instruction Fine-Tuning (IFT) emerged to address a fundamental gap in Large Language Models (LLMs): aligning next-token prediction with tasks that dem... (see the sketch below)

3 months, 4 weeks ago Blog - nept…
55663 words 185 min
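
A common detail of training on prompt-response pairs is computing the loss only on response tokens. A minimal PyTorch sketch, not from the article; the token ids are toy values, and the next-token shift is omitted for brevity.

```python
import torch
import torch.nn.functional as F

vocab_size = 100
prompt_ids = torch.tensor([5, 17, 42])    # toy instruction tokens
response_ids = torch.tensor([7, 99, 3])   # toy response tokens

input_ids = torch.cat([prompt_ids, response_ids])
labels = input_ids.clone()
labels[: len(prompt_ids)] = -100          # mask the prompt out of the loss

# Stand-in for model output; a real model would produce these logits.
logits = torch.randn(len(input_ids), vocab_size)
loss = F.cross_entropy(logits, labels, ignore_index=-100)
print(loss.item())
```
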
Understanding Prompt Injection: Risks, Methods, and Defense Measures (ML, MLOps)

Here’s something fun to start with: Open ChatGPT and type, “Use all the data you have about me and roast me. Don’t hold back.”...

5 months, 1 week ago Blog - nept…
43538 words 145 min
SabiYarn: Advancing Low-Resource Languages With Multitask NLP Pre-Training [Paper Reflections] (ML, MLOps)

In recent years, Large Language Models (LLMs) have mostly improved by scaling. This has primarily involved increasing the size of the LLMs and the dat...

5 months, 2 weeks ago Blog - nept…
16197 words 53 min
How to Monitor, Diagnose, and Solve Gradient Issues in Foundation Models (ML, MLOps)

As foundation models scale to billions or even trillions of parameters, they often exhibit training instabilities, particularly vanishing and explodin... (see the sketch below)

6 months, 1 week ago Blog - nept…
38297 words 127 min
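
A minimal sketch, not from the article, of the basic monitoring step: log per-parameter gradient norms after backward() so vanishing (near-zero) or exploding (very large) gradients show up early. The tiny model and thresholds are illustrative.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 8), nn.Tanh(), nn.Linear(8, 1))
loss = model(torch.randn(32, 8)).pow(2).mean()
loss.backward()

for name, p in model.named_parameters():
    norm = p.grad.norm().item()
    flag = "vanishing?" if norm < 1e-7 else "exploding?" if norm > 1e3 else "ok"
    print(f"{name:12s} grad norm = {norm:.3e}  {flag}")

# Gradient clipping is a common mitigation for exploding gradients.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```
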
STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning [Paper Reflection] (ML, MLOps)

Mixture-of-Experts (MoE) architectures offer a promising solution by sparsely activating specific parts of the model, reducing the inference overhead... (see the sketch below)

7 months, 1 week ago Blog - nept…
10847 words 36 min
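
A minimal sketch, not from the paper, of the sparsity the teaser refers to: a router scores experts per token and only the top-k experts run. The sizes and the dense-layer experts are illustrative assumptions.

```python
import torch
import torch.nn as nn

n_experts, k, d = 8, 2, 16
experts = nn.ModuleList(nn.Linear(d, d) for _ in range(n_experts))
router = nn.Linear(d, n_experts)

x = torch.randn(d)                    # one token's hidden state
topk = torch.topk(router(x), k)       # pick the k best-scoring experts
weights = torch.softmax(topk.values, dim=-1)

# Only the k selected experts are evaluated; the others are skipped.
out = sum(w * experts[int(i)](x) for w, i in zip(weights, topk.indices))
print(out.shape)
```
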
Evaluating RAG Pipelines (ML, MLOps)

Retrieval-augmented generation (RAG) is a technique for augmenting the generative capabilities of a large language model (LLM) by integrating it with... (see the sketch below)

8 months ago Blog - nept…
63807 words 212 min
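
A toy retrieve-then-generate loop, not from the article, showing the two stages a RAG evaluation has to cover. The corpus, the word-overlap retriever, and the generate stand-in are illustrative assumptions; a real pipeline would use embeddings and an LLM.

```python
corpus = [
    "Neptune tracks machine learning experiments.",
    "RAG combines retrieval with text generation.",
    "Transformers use attention to process sequences.",
]

def retrieve(query, k=1):
    # Toy retriever: rank documents by word overlap with the query.
    words = set(query.lower().split())
    ranked = sorted(corpus,
                    key=lambda doc: len(words & set(doc.lower().split())),
                    reverse=True)
    return ranked[:k]

def generate(query, context):
    # Stand-in for an LLM call conditioned on the retrieved context.
    return f"Answer to {query!r}, grounded in: {context[0]}"

query = "What does RAG combine?"
print(generate(query, retrieve(query)))
```
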