What are vectors and vector databases? Let me explain it to you by choosing a computer and writing code!

Written by
Iris Vance
Updated on: June 18, 2025
Recommendation

A laptop-shopping scenario and a few programming examples offer an easy introduction to the world of vectors and vector databases!

Core content:
1. Basic concepts and characteristics of vectors
2. Application scenarios of vectors in the AI era
3. Methods and technologies for implementing vectorization

Yang Fangxian
Founder of 53A/Most Valuable Expert of Tencent Cloud (TVP)

Introduction to Vectors

A “data fingerprint” more powerful than Excel


Imagine you walk into a computer mall and see that every laptop has a parameter card like this: 

CPU: i7-13700H
Graphics card: RTX 4060
Weight: 1.87 kg
Price: 8999 yuan

If we take the numeric part of each field and line the values up, we get a 4-dimensional vector:

[13700, 4060, 1.87, 8999]  


This is a vector: a string of numbers that describes the characteristics of an object, much like an ID card issued for each object.


A vector is a quantity that has both magnitude and direction, and in computers it is usually represented as an array of numbers [v1, v2, ..., vn].
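For readers who want the definition in symbols, here is a short sketch using the standard Euclidean conventions (the article itself does not spell these out). The magnitude is the length of the vector, and the direction is the same vector rescaled to length 1:

```latex
\|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2},
\qquad
\hat{\mathbf{v}} = \frac{\mathbf{v}}{\|\mathbf{v}\|}
```

Two vectors that point in nearly the same direction describe similar objects, even if their magnitudes differ.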


As the "memory center" of the AI era, vector databases are reshaping the boundaries of artificial intelligence applications by efficiently processing unstructured data.





Three characteristics of vectors


1. Dimensional freedom


The computer parameter vector may be 10-dimensional: [CPU, memory, hard disk, graphics card, screen size, weight, price...]


The programmer code vector may be 100-dimensional: [function length, number of loops, API calls, error types...]


2. Computability


Compare similarities through mathematical calculations:

# Calculate the similarity of two laptops (the closer the value is to 1, the more similar they are)
cos_sim(
    [i7, 4060, 1.8kg, 8999],    # Laptop A
    [i9, 4080, 2.1kg, 12999]    # Laptop B
) = 0.76
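The comparison above is pseudocode, so here is a minimal runnable sketch of cosine similarity with NumPy. The numeric encodings (for example, 7 and 9 standing in for i7 and i9) are assumptions made only for illustration, and the function name cos_sim simply mirrors the pseudocode. With raw values like these, the result will likely land much closer to 1 than the illustrative 0.76, because the large price and graphics card numbers dominate the calculation.

```python
import numpy as np

def cos_sim(a, b):
    """Cosine similarity: dot product divided by the product of the vector lengths."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical numeric encodings: [CPU tier, GPU model number, weight kg, price yuan]
laptop_a = [7, 4060, 1.8, 8999]
laptop_b = [9, 4080, 2.1, 12999]

print(f"Cosine similarity: {cos_sim(laptop_a, laptop_b):.4f}")
```

In practice, hand-built feature vectors are usually rescaled first so that no single field decides the outcome.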


3. Semantic magic


AI can turn text into vectors, allowing computers to understand semantics:

"laptop" → [0.23, 0.76, -0.12,...]
"portable computer" → [0.25, 0.74, -0.09,...]


The similarity between these two vectors is as high as 0.98! Because the computer can recognize that the two phrases mean the same thing, you no longer have to compare and select computers by hand.






Industry Applications of Vectors


Scenario 1: Accurate recommendation for computer sales


When a customer says, "I want a thin and light office notebook," the system:


  1. Convert the requirements to a vector → [CPU importance: 0.3, graphics card importance: 0.1, weight importance: 0.6...]


  2. Search the inventory vector library for the closest match → HUAWEI MateBook X Pro 2024


  3. Generate a recommendation: "This model weighs only 1.26kg and has a battery life of 18 hours, making it particularly suitable for mobile office work."
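The matching step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the inventory entries and their 0-to-1 feature scores are made up, the function name recommend is hypothetical, and a plain weighted sum stands in for whatever matching a real system would use.

```python
import numpy as np

# Customer priorities from step 1: [CPU, graphics card, lightness]
preference = np.array([0.3, 0.1, 0.6])

# Hypothetical inventory; each feature is pre-scaled to 0..1 (1 = best for that feature)
inventory = {
    "Gaming laptop (hypothetical)": np.array([0.9, 1.0, 0.2]),
    "Thin-and-light (hypothetical)": np.array([0.7, 0.2, 1.0]),
    "Budget all-rounder (hypothetical)": np.array([0.5, 0.4, 0.6]),
}

def recommend(preference, inventory):
    """Return (name, score) pairs sorted by weighted score, best first."""
    scores = {name: float(np.dot(preference, feats)) for name, feats in inventory.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

ranking = recommend(preference, inventory)
best_name, best_score = ranking[0]
print(f"Recommended: {best_name} (score {best_score:.2f})")
```

Because lightness carries the largest weight, the thin-and-light entry ranks first even though its graphics score is the lowest.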


Scenario 2: Programmer code reuse


When you need to implement the "form validation" function:


  1. Convert the requirements into a code feature vector → [validation function: 1, regular expression: 0.8, error handling: 0.7...]


  2. Search the code vector library for the closest match → validateForm.ts in the 2023 project


  3. Show an automatic prompt: "Refer to the TypeScript validation tool class on lines 45-78"





How to generate vectors


1. Direct conversion of numeric values


Computer price: 8999 → used directly as a one-dimensional vector

Lines of code: 128 → used directly as a one-dimensional vector
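When several numeric values are combined into one vector, their scales can differ widely (a price in the thousands next to a weight of about 2). A common preprocessing step, which the article does not describe and is shown here only as an assumption, is min-max scaling of each feature. The catalogue values and the function name min_max_scale are hypothetical.

```python
import numpy as np

# Hypothetical catalogue: rows are laptops, columns are [price yuan, weight kg, memory GB]
raw = np.array([
    [8999.0, 1.87, 16.0],
    [12999.0, 2.15, 32.0],
    [5999.0, 1.26, 8.0],
])

def min_max_scale(matrix):
    """Rescale each column to the range 0..1 so no single feature dominates."""
    lo = matrix.min(axis=0)
    hi = matrix.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero for constant columns
    return (matrix - lo) / span

scaled = min_max_scale(raw)
print(scaled.round(2))
```

After scaling, distance and similarity calculations treat price, weight, and memory on an equal footing.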


2. Text vectorization


Using AI models (such as BERT):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer('paraphrase-multilingual-MiniLM-L12-v2')

# Convert text to a 384-dimensional vector (the output size of this model)
text = "RTX 4080 supports DLSS 3.0 technology"
vector = model.encode(text)  # Output is like [0.23, -0.45, 0.17,...]


3. Image/speech vectorization


Image: Extract feature vectors with a ResNet model

Speech: Convert to text with the Whisper model, then vectorize the text




Domain positioning of vector database


1. Technical pedigree


Database technology: a branch of unstructured databases, standing alongside relational databases


AI infrastructure: a new storage engine for the era of large models, supporting semantic understanding capabilities


2. Technology comparison


| Database Type | Core Competencies | Typical Representatives |
| --- | --- | --- |
| Relational Database | Precise query of structured data | MySQL, Oracle |
| Document Database | Flexible storage in JSON format | MongoDB |
| Vector Database | High-dimensional data similarity retrieval | Milvus, Pinecone |
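To make "similarity retrieval" concrete, here is a toy in-memory store that does a brute-force cosine search. It is a sketch only: the class TinyVectorStore and its methods are hypothetical, not the API of Milvus or Pinecone, and production systems typically rely on approximate nearest neighbor indexes instead of scanning every vector. The example vectors reuse just the three components shown earlier for "laptop" and "portable computer" (real embeddings are much longer), and the third entry is made up.

```python
import numpy as np

class TinyVectorStore:
    """Hypothetical in-memory store: brute-force cosine search, no index."""

    def __init__(self):
        self.ids = []
        self.vectors = []

    def add(self, item_id, vector):
        v = np.asarray(vector, dtype=float)
        self.ids.append(item_id)
        self.vectors.append(v / np.linalg.norm(v))  # store unit-length vectors

    def search(self, query, k=2):
        """Return the k most similar (id, cosine similarity) pairs, best first."""
        q = np.asarray(query, dtype=float)
        q = q / np.linalg.norm(q)
        sims = np.stack(self.vectors) @ q  # dot product of unit vectors = cosine
        top = np.argsort(-sims)[:k]
        return [(self.ids[i], float(sims[i])) for i in top]

store = TinyVectorStore()
store.add("laptop", [0.23, 0.76, -0.12])
store.add("portable computer", [0.25, 0.74, -0.09])
store.add("unrelated item (made up)", [-0.60, 0.10, 0.70])

results = store.search([0.24, 0.75, -0.10], k=2)
print(results)
```

The query returns the two laptop-like entries and leaves out the unrelated one, which is the core behavior a vector database provides at much larger scale.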


3. Technology stack position


[Figure: Database technology stack]




Why do vectors improve efficiency?


1. Traditional method


Computer sales: manually flipping through parameter tables to compare models


Programmers: searching the whole codebase for keywords


2. Vectorized method


Semantic understanding: knowing that "thin and light notebook" ≈ "portable notebook" ≈ "ultrabook"


Fuzzy matching: even if the parameters are not exactly the same, you can find the closest option


Cross-modal search: search for images using text, search for code using error logs




 Knowledge Map


Theoretical basis: Linear algebra → Vector space theory → Machine learning embedding technology


Engineering practice: Database principles → Approximate nearest neighbor algorithm → Vector database architecture


Application extension: Recommendation system/semantic search → RAG architecture → Enterprise knowledge management


[Figure: The development of vector technology]




Hands-on experiment


Quickly experience vector magic with Python: 

import numpy as np

# Define vectors of two notebooks: [CPU cores, memory GB, hard disk GB, weight kg]
notebook_A = np.array([8, 16, 512, 1.87])
notebook_B = np.array([12, 32, 1024, 2.15])

# Calculate Euclidean distance (the smaller the value, the more similar)
distance = np.linalg.norm(notebook_A - notebook_B)
print(f"The difference index of these two notebooks: {distance:.2f}")