Getting Responses from Local LLM Models with Python
Introduction
After setting up your local LLM model with LM Studio (as covered in my previous article), the next step is to interact with it programmatically using Python. This article will show you how to create a simple yet powerful Python interface for your local LLM.
Step 1: Start Your Local LLM System
Before running the Python code, ensure your local LLM system is up and running. Most systems expose a RESTful API or a similar interface for interaction.
For instance, LM Studio or similar tools may provide a local endpoint. You can find your local server address and supported endpoints in the LM Studio interface.

As you can see, the local server address is http://127.0.0.1:1234, and the supported endpoints are listed in the LM Studio interface.
Step 2: List Available Models
The /v1/models endpoint retrieves the list of available models.
Here’s a basic Python script that calls this endpoint and prints the models available on your local server.
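The snippet below is a minimal sketch using the requests library (install it with pip install requests if needed); it assumes the default local server address shown above.

```python
import requests

# Base URL of the LM Studio local server (shown in the LM Studio interface)
BASE_URL = "http://127.0.0.1:1234"

# Ask the /v1/models endpoint which models are currently hosted
response = requests.get(f"{BASE_URL}/v1/models")
response.raise_for_status()

# The endpoint returns an OpenAI-style list; each entry has an "id" field
for model in response.json()["data"]:
    print(model["id"])
```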
This will display the models hosted by LM Studio.
Step 3: Get Response from LLM
/v1/completions is for single prompts, while /v1/chat/completions is for conversations with context.
1. Generate a Completion
Use the /v1/completions endpoint to send a prompt and receive a response.
This endpoint generates a response to a simple text prompt. It’s straightforward and doesn’t involve a conversation context. Use this when you need a single, standalone output based on your input.
Example:
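Below is a minimal sketch for this endpoint; the model name and prompt are placeholders, so substitute a model id returned by /v1/models and your own input.

```python
import requests

BASE_URL = "http://127.0.0.1:1234"

payload = {
    "model": "your-model-name",   # placeholder: use a model id from /v1/models
    "prompt": "Write a haiku about the ocean.",
    "max_tokens": 100,            # cap on the length of the generated response
    "temperature": 0.7,           # higher values make the output more random
}

response = requests.post(f"{BASE_URL}/v1/completions", json=payload)
response.raise_for_status()

# The generated text lives in the first choice of the response
print(response.json()["choices"][0]["text"])
```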
Key Parameters:
• model: Specify the model to use.
• prompt: Your input to the model.
• max_tokens: Controls the maximum length of the response.
• temperature: Adjusts the randomness of the output.
2. Use the Chat Completion Endpoint
The /v1/chat/completions endpoint is ideal for conversation-like interactions.
This endpoint is for managing multi-turn conversations. It keeps track of context using roles (system, user, assistant). Use this when you need interactive, dynamic conversations with the model.
Example:
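A sketch along the same lines, again with a placeholder model name; the messages list carries the conversation roles.

```python
import requests

BASE_URL = "http://127.0.0.1:1234"

payload = {
    "model": "your-model-name",   # placeholder: use a model id from /v1/models
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
    "max_tokens": 100,
    "temperature": 0.7,
}

response = requests.post(f"{BASE_URL}/v1/chat/completions", json=payload)
response.raise_for_status()

# The assistant's reply is in the message of the first choice
print(response.json()["choices"][0]["message"]["content"])
```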
When to Use:
• Interactive tasks: Provide back-and-forth dialogue with context awareness.
• Multi-step queries: Answer questions with follow-ups, maintaining the conversation thread (see the sketch after this list).
• Context-sensitive tasks: Adjust responses based on prior inputs or the user’s context.
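To keep a conversation thread going, append the assistant's reply and the next user message to the same messages list before calling the endpoint again. The sketch below makes the same assumptions as the earlier examples.

```python
import requests

BASE_URL = "http://127.0.0.1:1234"
MODEL = "your-model-name"  # placeholder: use a model id from /v1/models

# Keep the full history in one list so the model sees every prior turn
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]

first = requests.post(
    f"{BASE_URL}/v1/chat/completions",
    json={"model": MODEL, "messages": messages},
).json()
messages.append(first["choices"][0]["message"])  # record the assistant's reply

# A follow-up question that only makes sense with the earlier context
messages.append({"role": "user", "content": "And what is its population?"})
second = requests.post(
    f"{BASE_URL}/v1/chat/completions",
    json={"model": MODEL, "messages": messages},
).json()
print(second["choices"][0]["message"]["content"])
```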
Conclusion
By following this guide, you can use Python to interact with your local LLM model. This is a simple and powerful way to integrate LLMs into your applications.
Feel free to expand these scripts for more complex applications, such as automation or integration with other tools!