
# ndarray slicing with index out of bounds

While training a GAN, I accidentally set the loop that iterates through batches of training data to stop at a value far beyond the length of the training set. More precisely, I did this:


```
# load MNIST dataset and make necessary preprocessing steps
X_real, y_real = get_real_data()

# now we split into train and test
X_real_train, X_real_test, y_real_train, y_real_test = train_test_split(X_real, y_real, test_size=0.15)

n_samples = len(X_real) * 2 # we add an equal amount of fake data (made by the generator of the GAN)
```

Notice that instead of `len(X_real_train)` I accidentally wrote `len(X_real)`.

```
for e in tqdm(range(epochs)):
    for b in tqdm(range(n_samples // batch_size)):
        ## train Discriminator
        
        # get a batch of real images
        start = b * half_batch
        end = (b + 1) * half_batch
        
        X_real_b = X_real_train[start : end]
        y_real_b = y_real_train[start : end]
        
        # combine the real and fake images
        X = np.vstack((X_real_b, X_fake_b))
        y = np.vstack((y_real_b, y_fake_b))
        
        # run a single gradient update on a single batch of data (half real, half fake)
        # update weights and store discriminator prediction loss
        disc_loss, _ = disc_model.train_on_batch(X, y)
        .....
```

During each batch iteration I extracted one slice from `X_real_train` and fed it to a classifier for training.

Training seemed to go well for the first few hundred batches. I had a sanity check after each batch to see how the model's metrics changed.

But since `print` slows down training on a GPU so much, I removed this sanity check and printed every ten epochs (1400 batches) instead of after each batch.

And after the first epoch I noticed that the classifier (`disc_model`) would settle into *always* predicting 0 and never left that state for the rest of the training.

I couldn't figure out why. I went over the architecture of the generator, of the discriminator etc.

When I finally got to the training loop, I saw the `X_real` mistake, and I couldn't figure out why I had not gotten an error.

It turns out this is how ndarray works:

![](/files/vboNo5YOdzKT4jlpT6zQ)

Slicing doesn't raise an error when both the start and stop indices are larger than the sequence length; it simply returns an empty sequence. This is in contrast to simple indexing: indexing a single element that is out of bounds raises an [`IndexError`](https://blog.finxter.com/python-indexerror-list-index-out-of-range/).
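A minimal sketch of this behaviour, using a small made-up array (`data` is purely illustrative and not part of my training code):

```python
import numpy as np

data = np.arange(10)          # 10 elements, valid indices 0..9

# A slice that lies entirely past the end returns an empty array, no error.
past_end = data[20:30]
print(past_end.shape)         # (0,)

# A slice that only partly runs past the end is silently truncated.
partial = data[8:12]
print(partial)                # [8 9]

# Indexing a single element past the end does raise.
try:
    data[20]
    raised = False
except IndexError:
    raised = True
print(raised)                 # True
```

Plain Python lists behave the same way for these cases.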

Because my training data had 59500 samples rather than the 70000 of the full dataset (train + test), the batch loop should have stopped at 1190, but it ran to 1400.
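The arithmetic behind those two numbers can be sketched as follows. Note that `half_batch = 50` and `batch_size = 100` are assumptions here: they are the values that make 59500 and 70000 line up with 1190 and 1400, not values shown in the snippets above.

```python
n_total = 70000               # len(X_real): train + test
n_train = 59500               # len(X_real_train) after the 15% test split
half_batch = 50               # assumed, inferred from 59500 / 1190
batch_size = 2 * half_batch   # assumed: half real, half fake

n_samples = n_total * 2                          # the buggy line
batches_run = n_samples // batch_size            # how far the loop actually went
batches_with_real_data = n_train // half_batch   # how far it should have gone
empty_batches = batches_run - batches_with_real_data

print(batches_run, batches_with_real_data, empty_batches)  # 1400 1190 210

# The first slice past the end starts exactly at len(X_real_train).
first_bad_start = batches_with_real_data * half_batch
print(first_bad_start)        # 59500
```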

From 1190, I was going into index out of bounds territory.

But since ndarray would not throw an error and instead returned an empty array, I didn't notice.

Which means that for iterations 1190 to 1400 I was training a classifier on an empty array of real data (label 1) and a batch of 100 fake samples (label 0).
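Here is a small reproduction of why nothing complained: `np.vstack` happily stacks an empty slice on top of the fake batch. The arrays and shapes below are tiny stand-ins, not my real MNIST data.

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.random((20, 4))     # stand-in for X_real_train
y_train = np.ones((20, 1))        # real samples have label 1

X_fake_b = rng.random((5, 4))     # stand-in for a generated batch
y_fake_b = np.zeros((5, 1))       # fake samples have label 0

start, end = 25, 30               # both past the end of X_train
X_real_b = X_train[start:end]     # shape (0, 4), no error
y_real_b = y_train[start:end]     # shape (0, 1), no error

X = np.vstack((X_real_b, X_fake_b))
y = np.vstack((y_real_b, y_fake_b))

print(X.shape)                    # (5, 4): only the fake samples are left
print(bool((y == 0).all()))       # True: every label in the batch is 0
```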

And in these 210 iterations, my classifier learned to always predict 0. In the next epoch, it would start over: it would learn to predict correctly, then towards the end of the epoch I would again have 210 iterations of training on an empty array of real data (label 1) and a batch of 100 fake samples (label 0).
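One way to make this kind of mistake fail loudly is to derive the number of batches from the array that is actually sliced, and to check the size of every slice. The helper below, `iter_real_batches`, is a hypothetical sketch and not the code I was running:

```python
import numpy as np

def iter_real_batches(X, y, half_batch):
    """Yield (X_batch, y_batch) slices that are guaranteed to be full."""
    n_batches = len(X) // half_batch   # derived from the array being sliced
    for b in range(n_batches):
        start = b * half_batch
        end = start + half_batch
        X_b, y_b = X[start:end], y[start:end]
        # Fail loudly instead of training on a short or empty batch.
        assert len(X_b) == half_batch, f"batch {b} has {len(X_b)} samples"
        yield X_b, y_b

# Tiny stand-in data: 23 samples, so 4 full batches of 5.
X_demo = np.zeros((23, 4))
y_demo = np.ones((23, 1))
batches = list(iter_real_batches(X_demo, y_demo, half_batch=5))
print(len(batches))                    # 4
```

The leftover samples at the end are dropped in this sketch; whether that is acceptable depends on the training setup.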

