- [k-NN Plugin](#k-nn-plugin)
  - [Basic Approximate k-NN](#basic-approximate-k-nn)
    - [Create an Index](#create-an-index)
    - [Index Vectors](#index-vectors)
    - [Search for Nearest Neighbors](#search-for-nearest-neighbors)
  - [Approximate k-NN with a Boolean Filter](#approximate-k-nn-with-a-boolean_filter)
  - [Approximate k-NN with an Efficient Filter](#approximate-k-nn-with-an-efficient-filter)

# k-NN Plugin

Short for k-nearest neighbors, the k-NN plugin enables users to search for the k-nearest neighbors to a query point across an index of vectors. See [documentation](https://opensearch.org/docs/latest/search-plugins/knn/index/) for more information.

## Basic Approximate k-NN

In the following example we create a 5-dimensional k-NN index with random data. You can find a synchronous version of this working sample in [samples/knn/knn_basics.py](../../samples/knn/knn_basics.py) and an asynchronous one in [samples/knn/knn_async_basics.py](../../samples/knn/knn_async_basics.py).

```bash
$ poetry run python knn/knn_basics.py

Searching for [0.61, 0.05, 0.16, 0.75, 0.49] ...
{'_index': 'my-index', '_id': '3', '_score': 0.9252405, '_source': {'values': [0.64, 0.3, 0.27, 0.68, 0.51]}}
{'_index': 'my-index', '_id': '4', '_score': 0.802375, '_source': {'values': [0.49, 0.39, 0.21, 0.42, 0.42]}}
{'_index': 'my-index', '_id': '8', '_score': 0.7826564, '_source': {'values': [0.33, 0.33, 0.42, 0.97, 0.56]}}
```

### Create an Index

```python
dimensions = 5
client.indices.create(index_name, 
    body={
        "settings":{
            "index.knn": True
        },
        "mappings":{
            "properties": {
                "values": {
                    "type": "knn_vector", 
                    "dimension": dimensions
                },
            }
        }
    }
)
```

### Index Vectors

Create 10 random vectors and insert them using the bulk API.

```python
vectors = []
for i in range(10):
    vec = []
    for j in range(dimensions): 
        vec.append(round(random.uniform(0, 1), 2)) 
  
    vectors.append({
        "_index": index_name,
        "_id": i,
        "values": vec,
    })

helpers.bulk(client, vectors)

client.indices.refresh(index=index_name)
```

### Search for Nearest Neighbors

Create a random vector of the same size and search for its nearest neighbors.

```python
vec = []
for j in range(dimensions): 
    vec.append(round(random.uniform(0, 1), 2)) 

search_query = {
    "query": {
        "knn": {
            "values": {
                "vector": vec, 
                "k": 3
            }
        }
    }
}

results = client.search(index=index_name, body=search_query)
for hit in results["hits"]["hits"]:
    print(hit)
```

## Approximate k-NN with a Boolean Filter

In [the knn_boolean_filter.py sample](../../samples/knn/knn_boolean_filter.py) we create a 5-dimensional k-NN index with random data and a `metadata` field that contains a book genre (e.g. `fiction`). The search query is a k-NN search filtered by genre. The filter clause is outside the k-NN query clause and is applied after the k-NN search.

```bash
$ poetry run python knn/knn_boolean_filter.py 

Searching for [0.08, 0.42, 0.04, 0.76, 0.41] with the 'romance' genre ...

{'_index': 'my-index', '_id': '445', '_score': 0.95886475, '_source': {'values': [0.2, 0.54, 0.08, 0.87, 0.43], 'metadata': {'genre': 'romance'}}}
{'_index': 'my-index', '_id': '2816', '_score': 0.95256233, '_source': {'values': [0.22, 0.36, 0.01, 0.75, 0.57], 'metadata': {'genre': 'romance'}}}
```

## Approximate k-NN with an Efficient Filter

In [the knn_efficient_filter.py sample](../../samples/knn/knn_efficient_filter.py) we implement the example in [the k-NN documentation](https://opensearch.org/docs/latest/search-plugins/knn/filter-search-knn/), which creates an index that uses the Lucene engine and HNSW as the method in the mapping, containing hotel location and parking data, then search for the top three hotels near the location with the coordinates `[5, 4]` that are rated between 8 and 10, inclusive, and provide parking.

```bash
$ poetry run python knn/knn_efficient_filter.py

{'_index': 'hotels-index', '_id': '3', '_score': 0.72992706, '_source': {'location': [4.9, 3.4], 'parking': 'true', 'rating': 9}}
{'_index': 'hotels-index', '_id': '6', '_score': 0.3012048, '_source': {'location': [6.4, 3.4], 'parking': 'true', 'rating': 9}}
{'_index': 'hotels-index', '_id': '5', '_score': 0.24154587, '_source': {'location': [3.3, 4.5], 'parking': 'true', 'rating': 8}}
```