

Omni Reference in Midjourney V7: The Complete Guide to Precise Image Generation, Consistency, and Control

February 07, 2026

Midjourney’s Omni Reference is a new technology in Version 7 that lets users precisely control AI image generation. Omni Reference turns Midjourney into a professional tool: add an image reference, adjust the oref parameter, control the omni weight (ow), and achieve stable, predictable results.

In this article, we break down Omni Reference features, explain ow values, provide step-by-step instructions, and share practical use cases for real projects.

Introduction

AI image generation has advanced rapidly, but users have long faced challenges in controlling outputs—characters would change, objects would shift, and styles wouldn’t remain consistent.

Midjourney’s Omni Reference technology solves this problem systematically.

Now you can precisely define the influence of a reference image, controlling facial features, clothing, style, visual elements, and details. This is especially important for projects requiring consistent visuals—whether for websites, marketing materials, or video content.

What Is Omni Reference and How Does It Work?

Omni Reference is a system that analyzes a source image and extracts key characteristics:

  • Shape and proportions of objects
  • Style and color palette
  • Facial features of characters
  • Clothing and materials
  • Repeating elements

This data is then used by the AI during generation. Omni Reference doesn’t just copy an image—it adapts it to fit a new prompt. This ensures a balance between creativity and accuracy.

Omni Reference vs. Character Reference in Midjourney V7

Previously, Midjourney offered Character Reference, which worked mainly with characters. The key difference is that Omni Reference is broader and covers multiple aspects.

| Capability | Character Reference | Omni Reference |
| --- | --- | --- |
| Characters | Yes | Yes |
| Objects | No | Yes |
| Face & Clothing | Limited | Yes |
| Style | Partial | Yes |
| Multiple Objects | No | Yes |
| Textures & Backgrounds | No | Yes |

Omni Reference significantly expands Midjourney’s visual control capabilities.

Key Functions of Omni Reference

Key features of Omni Reference include:

  • High-accuracy transfer of visual elements
  • Adjustable influence strength
  • Compatibility across models and versions
  • Consistent results for image series
  • Support for multiple objects and characters

These features make Midjourney more than just a generator—it becomes a full-fledged image creation system.

Parameters: oref, omni weight (ow), and Influence Levels

The oref Parameter

The oref parameter is the URL of the image used as a reference. All image links must be publicly accessible.

Example query: /imagine prompt futuristic character --oref https://site.com/image.jpg

Omni Weight (ow)

Omni weight (ow) determines how strongly the reference influences the generated image. The default value is 1000, but fine-tuning unlocks its full potential.

ow Value Ranges: Low, Medium, and High

  • Low values (25–100): minimal influence, more AI creativity. Ideal for stylization and experimentation.
  • Medium values (200–400): a balanced blend of originality and reference fidelity. The most popular range for Midjourney images.
  • High values (600–1000+): strong influence. Objects, faces, and style closely match the source image.

Important note: High ow values provide control but may reduce variety.
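As a sketch, the command format and the ow ranges above can be wrapped in a small helper. The /imagine, --oref, and --ow syntax comes from this article; the function itself and its fidelity labels are illustrative, not part of any official tooling:

```python
# Hypothetical helper for composing a Midjourney /imagine command with
# Omni Reference. The fidelity-to-ow mapping follows the ranges above.

def build_omni_prompt(description: str, ref_url: str, fidelity: str = "medium") -> str:
    """Return an /imagine command string for a given reference-fidelity level."""
    ow_by_fidelity = {
        "low": 50,      # 25-100: minimal influence, more AI creativity
        "medium": 350,  # 200-400: balanced, the most popular range
        "high": 800,    # 600-1000+: output closely matches the reference
    }
    ow = ow_by_fidelity[fidelity]
    return f"/imagine prompt {description} --oref {ref_url} --ow {ow}"

cmd = build_omni_prompt("futuristic character", "https://site.com/image.jpg")
print(cmd)  # /imagine prompt futuristic character --oref https://site.com/image.jpg --ow 350
```

Starting from the medium preset and then nudging ow up or down matches the tip of beginning low and increasing gradually.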

Step-by-Step Guide to Using Omni Reference

This beginner-friendly guide will get you started:

  1. In settings, select Midjourney V7.
  2. Prepare a clear reference image or photo.
  3. Obtain a direct image URL.
  4. Enter the query: /imagine prompt description --oref URL --ow 350
  5. Add optional parameters if needed (e.g., stylize, chaos).
  6. Review the result and adjust values as necessary.

Tip: Start with a low ow value and gradually increase it.

Texture Generation and Consistency Maintenance

With Omni Reference, texture generation is now a controlled process. You can create complex patterns and apply styles across different objects while maintaining visual integrity.

Now you can:

  • Apply textures to multiple objects
  • Maintain style consistency across asset series
  • Ensure character consistency in Midjourney
  • Build a cohesive visual core for projects

Example

An online clothing store used Omni Reference to generate 64 t-shirt variations from a single fabric photo. Result: unified style and reduced budget.

Strategies for Improving Result Accuracy

To maximize precision:

  • Choose a clear, high-quality reference image
  • Write detailed prompts
  • Start with low ow values
  • Keep track of parameters and links
  • Use the web interface for fine-tuning details

Business Case Study

A coffee chain used Midjourney Omni Reference with ow = 400. The outcome: a unified visual style and an approximate 15% reduction in marketing costs.

Omni Reference Applications for Various Tasks

Omni Reference can be used for:

  • Prototyping
  • Character design
  • Marketing and advertising
  • Website content creation
  • AI projects and video production

Even experimental models (like “nano banana”) suggest that Omni Reference will continue to expand in application.

Conclusion

Midjourney’s Omni Reference is a key tool in Version 7, elevating image generation to a professional level. It provides control, precision, and result stability.

If you regularly work with visuals, start using Omni Reference now. Experiment with ow values, combine multiple references, add complementary parameters, and unlock the full potential of Midjourney’s AI.


Max Godymchyk

Entrepreneur, marketer, and author of articles on artificial intelligence, art, and design. He helps businesses adopt modern technologies and makes people fall in love with them.

What's Better: DeepSeek or ChatGPT — A Complete Comparison

February 05, 2026

Choosing between the two leading neural networks determines the efficiency of working with information in 2026. Chinese DeepSeek and American ChatGPT offer different architectures, prices, and capabilities. One model costs 4.5 times less, the other has a larger context window. The difference lies in user accessibility, text generation speed, and data processing approaches. This article answers the questions: which neural network to choose for specific tasks, where each model performs better, and what are the pros and cons of each solution. The comparison is based on performance tests, developer feedback, and architectural analysis.

6 Key Differences That Determine the Choice

The choice between neural networks depends not on abstract characteristics, but on specific tasks. Six factors determine which model to use for work.

Table: 6 Key Differences Between DeepSeek and ChatGPT

| Criterion | DeepSeek | ChatGPT | Practical Significance |
| --- | --- | --- | --- |
| Architecture | Mixture-of-Experts (MoE) | Dense Transformer | 60% resource savings |
| API Cost | $0.28/1M tokens | $1.25/1M tokens | Saves $9,700 on 10k requests |
| Context Window | 128K tokens | 200K tokens | Handles 300-page documents |
| Coding Quality | 97% success rate | 89% success rate | Generates working code on the first try |
| Code Openness | MIT License | Proprietary | Enables local deployment |

Model Architecture: Mixture-of-Experts vs Dense Transformer

DeepSeek is built on Mixture-of-Experts (MoE). The system contains 256 experts, of which 8-9 are activated for each request. This gives the model 671 billion total parameters while utilizing only 37 billion at a time. ChatGPT uses a Dense architecture: all 1.8 trillion parameters work on every request. The difference in power consumption reaches 60%, and the MoE architecture processes specialized tasks 2-3 times faster, though it falls short in universality.

Table: Architecture Comparison

| Parameter | DeepSeek (MoE) | ChatGPT (Dense) | Advantage |
| --- | --- | --- | --- |
| Total Parameters | 671B | 1.8T | Lower infrastructure costs |
| Active Parameters | 37B (5.5%) | 1.8T (100%) | Selective activation |
| Power Consumption | 40% of Dense | 100% | 60% savings |
| Specialized Task Speed | +200-300% | Baseline | Faster for code and math |
| Universal Task Speed | -10-15% | Baseline | Lag in general questions |
| GPU Memory | 80GB for R1 | 320GB | Less memory required |

This architecture allows DeepSeek to spend less on servers. Users get free access without limits. For coding and math tasks, this delivers better results. For general text generation, the difference is less noticeable.

Usage Cost: 2026 Pricing Policy

DeepSeek-V3.2 API costs $0.028 per 1 million tokens with caching and $0.28 on cache misses. ChatGPT-5 charges $0.025 per 1 million tokens in the base plan, but advanced o3-mini models cost $1.25. Training DeepSeek V3 cost $5.6 million. ChatGPT-5 required investments exceeding $100 million. DeepSeek offers completely free access without restrictions. For businesses with 10,000 monthly requests, API savings amount to $9700 when using cache. DeepSeek caching provides 90% savings on repeated requests.

Table: 2025 Implementation Cost Comparison

| Component | DeepSeek V3.2 | ChatGPT-5/o3-mini | Difference |
| --- | --- | --- | --- |
| Price per 1M tokens (cache) | $0.028 | $0.025 (GPT-5) | Comparable |
| Price per 1M tokens (no cache) | $0.28 | $1.25 (o3-mini) | 4.5x more expensive |
| Training Cost | $5.6M | $100M+ | 17.8x more expensive |
| Monthly Plan | $0 (free) | $20 (Plus) | $20/month savings |
| Caching | 90% savings | 30-50% savings | More with DeepSeek |

This pricing makes DeepSeek accessible to startups and small companies. Unlimited free access allows testing ideas without risk.
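To see how these per-token prices compound, here is a rough cost model. The prices come from the table above; the workload figures (requests per month, tokens per request, cache hit rate) are illustrative assumptions, not benchmarks from this article:

```python
# Back-of-the-envelope API cost comparison using the per-million-token
# prices quoted above. Workload numbers are illustrative assumptions.

PRICES_PER_1M = {
    "deepseek_cached": 0.028,
    "deepseek_uncached": 0.28,
    "o3_mini": 1.25,
}

def monthly_cost(requests: int, tokens_per_request: int,
                 price_cached: float, price_uncached: float,
                 cache_hit_rate: float) -> float:
    """Blend cached and uncached prices by the assumed cache hit rate."""
    total_millions = requests * tokens_per_request / 1_000_000
    blended = cache_hit_rate * price_cached + (1 - cache_hit_rate) * price_uncached
    return total_millions * blended

# Example: 10,000 requests/month, 2,000 tokens each, 90% cache hits.
deepseek = monthly_cost(10_000, 2_000, PRICES_PER_1M["deepseek_cached"],
                        PRICES_PER_1M["deepseek_uncached"], 0.9)
chatgpt = monthly_cost(10_000, 2_000, PRICES_PER_1M["o3_mini"],
                       PRICES_PER_1M["o3_mini"], 0.0)  # no cache discount assumed
print(f"DeepSeek: ${deepseek:.2f}  o3-mini: ${chatgpt:.2f}")
```

The gap widens with volume, which is why the caching discount matters more for the API route than the sticker price.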

Text Generation Quality: Mathematical Precision vs Creativity

DeepSeek generates structured responses with step-by-step logic. A test on creating marketing articles showed that DeepSeek produced a plan with 15 points, each with detailed explanations, while ChatGPT gave 10 points but livelier, more emotional text. In mathematics, DeepSeek R1 achieves 90.8% on the MMLU test, slightly behind ChatGPT o1's 91.8%. In coding, however, DeepSeek shows a 97% success rate versus ChatGPT's 89%.

Example prompt: "Write a 100-word ad copy for a coffee shop"

DeepSeek's response:

"Coffee shop in the heart of downtown. Specialty grade, Fair Trade beans. Baristas are award-winning champions. Open 6:30 AM–9:00 PM. High-speed Wi-Fi, coworking area. Cappuccino: $4.50–$6.00. Promotion: unlimited filter coffee until 10 AM with pastry purchase. Address: 154 Main Street, NY. Phone: +1-212-555-0199. Website: downtown-brew.com. TikTok/Instagram: @brew_nyc. Delivery: Uber Eats, DoorDash, Grubhub. Minimum order: $15. Average delivery time: 25 minutes."

ChatGPT's response:

"Awakening in every cup! Our baristas are true magicians, transforming beans into aromatic masterpieces. Morning coffee with a croissant? The perfect start to the day! In the evening – cozy atmosphere, soft lighting, bookshelves. A place where friends meet, ideas are born, where every sip is a celebration. Come, we're waiting for you!"

Table: Generation Quality Comparison

| Criterion | DeepSeek | ChatGPT | Best Choice |
| --- | --- | --- | --- |
| Response Structure | Clear, bullet-pointed | Detailed, coherent | DeepSeek for specifications |
| Emotionality | Dry, factual | Warm, engaging | ChatGPT for SMM |
| Mathematics | 90.8% MMLU | 91.8% MMLU | ChatGPT o1 |
| Coding | 97% success rate | 89% success rate | DeepSeek R1 |
| Speed | +40% faster | Baseline | DeepSeek |
| Fact-checking | Required | Required | Both similar |

For marketing texts, ChatGPT creates more lively options. DeepSeek generates dry but accurate descriptions. For technical documentation and code, DeepSeek delivers better results.

Data Security: Chinese vs American Jurisdiction

DeepSeek stores information on servers in China. The privacy policy explicitly states: "We store the information we collect on secure servers located in China." This subjects data to Chinese legislation. China's 2021 Data Security Law obliges companies to provide authorities with access to information upon request.

ChatGPT stores data in the US and Europe. OpenAI offers GDPR-compliant versions for business. For European users, data remains in the EU. This complies with European legislation requirements.

The real consequences of jurisdictional differences have already emerged. In January 2025, the Italian regulator Garante requested explanations from DeepSeek regarding personal data processing. Twenty days later, the app disappeared from the Italian App Store and Google Play. The regulator is concerned that Italian citizens' data is being transferred to China.

Local DeepSeek deployment solves the security problem. Models are available under MIT license.

Table: Data Security Comparison

| Aspect | DeepSeek (Cloud) | ChatGPT (Cloud) | Local DeepSeek |
| --- | --- | --- | --- |
| Storage Location | China | USA/Europe | Your own servers |
| Legal Basis | China's Data Law | GDPR / Privacy Shield | Internal policy |
| Government Access | Upon request, no court | Limited judicial process | Your control only |
| Store Removals | Italy (Jan 2025) | None | Not applicable |
| Suitable for Government Contracts | No | No | Yes |
| Deployment Cost | $0 (ready-made) | $0 (ready-made) | From $5,000 |

Code Openness: Customization and Fine-tuning Capabilities

DeepSeek releases its models under the MIT license. The code is available on GitHub and can be modified and used commercially. Versions from 1.5B to 70B parameters can run on your own servers. ChatGPT provides only an API; its source code is closed. For companies with unique tasks, fine-tuning DeepSeek costs around $5,000, while training from scratch costs $100,000+.

Technical Specifications: Head-to-Head Comparison

Technical specifications determine which model can be integrated into existing infrastructure. A deep dive into parameters helps avoid selection mistakes.

Table: Complete Comparison of DeepSeek and ChatGPT 2025 Technical Parameters

| Parameter | DeepSeek V3.2-Exp | ChatGPT-5 / o3-mini | Unit |
| --- | --- | --- | --- |
| Total Parameters | 671 | 1750 | billions |
| Active Parameters per Request | 37 | 1750 | billions |
| Context Window | 128 | 200 | thousand tokens |
| Price per 1M tokens (cache) | $0.028 | $0.025 | dollars |
| Price per 1M tokens (no cache) | $0.28 | $1.25 | dollars |
| Generation Speed | 89 | 65 | tokens/second |
| Language Support | 40+ | 50+ | languages |
| Mathematics (MMLU) | 90.8 | 91.8 | percent |
| Coding (HumanEval) | 97.3 | 89.0 | percent |
| License | MIT + custom | Proprietary | --- |
| Local Deployment | Yes | No | --- |

Architecture and Performance: How MoE Outperforms Dense

Mixture-of-Experts in DeepSeek works through 256 independent expert modules, each a full neural network with 2.6 billion parameters. A router analyzes the request and selects the 8-9 most relevant experts; this happens in 0.3 milliseconds. ChatGPT's Dense architecture activates all 1,750 billion parameters on every request, which guarantees stability but requires 47 times more computation.

In practice, the difference shows up in speed: DeepSeek processes technical queries in 2.1 seconds, while ChatGPT spends 3.4 seconds on similar tasks. Meanwhile, DeepSeek's mathematical problem-solving quality is slightly higher, as confirmed by the 2024 AIME test: DeepSeek R1 solved 79.8% of problems versus 79.2% for ChatGPT o1.

Key advantage: MoE architecture allows adding new experts without retraining the entire model. This reduces specialized knowledge implementation time from 3 months to 2 weeks.
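The routing idea described above can be illustrated with a toy sketch: score every expert, run only the top-k, and mix their outputs by softmax weight. The dimensions mirror the article (256 experts, 8 active), but the tiny matrix "experts" and the router are placeholders, not DeepSeek's real layers:

```python
# Toy illustration of Mixture-of-Experts routing: a router scores all
# experts, only the top-k run, and their outputs are combined by
# softmax weights. This is a teaching sketch, not a real MoE layer.

import numpy as np

rng = np.random.default_rng(0)
NUM_EXPERTS, TOP_K, DIM = 256, 8, 16

router_w = rng.normal(size=(DIM, NUM_EXPERTS))            # router projection
experts = rng.normal(size=(NUM_EXPERTS, DIM, DIM)) * 0.1  # one matrix per "expert"

def moe_forward(x: np.ndarray) -> np.ndarray:
    scores = x @ router_w              # score every expert for this input
    top = np.argsort(scores)[-TOP_K:]  # keep only the top-k experts
    logits = scores[top]
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()           # softmax over the selected experts
    # Only k of 256 experts do any work -- the source of the compute savings.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

y = moe_forward(rng.normal(size=DIM))
print(y.shape)  # (16,)
```

Because unselected experts never run, adding a new expert extends capacity without touching the others, which is the modularity advantage the paragraph above describes.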

Pricing and Total Cost of Ownership: Hidden Expenses

API price is just the tip of the iceberg. Total cost of ownership includes infrastructure, support, personnel training, and availability risks.

Table: TCO Comparison for a Typical 500-Employee Company (12 Months)

| Expense Item | DeepSeek (Local) | DeepSeek (API) | ChatGPT (Official) |
| --- | --- | --- | --- |
| Licenses/API | $0 | $18,000 | $36,000 |
| Servers (GPU) | $48,000 | $0 | $0 |
| Electricity | $7,200 | $0 | $0 |
| Integration | $15,000 | $12,000 | $15,000 |
| Support | $6,000 | $3,600 | $4,800 |
| Certification | $8,000 | $3,000 | $2,000 |
| Total Annual TCO | $84,200 | $36,600 | $57,800 |

Industry Comparison and Use Cases

Model selection depends not only on technical specifications but also on industry specifics. Deep understanding of domain features allows extracting maximum value from AI investments.

Table: Comparison by Key Industries and Use Cases

| Industry/Scenario | DeepSeek Better For | ChatGPT Better For |
| --- | --- | --- |
| Finance & Banking | Risk analysis, local data processing | Customer service, international markets |
| Software | Code review, refactoring, debugging | Prototyping, documentation |
| Healthcare | Medical record processing, diagnosis | International research, consultations |
| Education | Learning personalization, work checking | English content, global courses |
| Data Analysis | Statistics, mathematical models | Visualization, interpretation |

Integration and Implementation: Hidden Complexities

Implementing AI in production differs from test deployments. DeepSeek requires infrastructure setup, ChatGPT requires solving access issues.

Table: Comparison of Implementation Timelines and Complexity

| Stage | DeepSeek (Local) | DeepSeek (API) | ChatGPT |
| --- | --- | --- | --- |
| Infrastructure Prep | 6-8 weeks | 0 weeks | 0 weeks |
| Security Setup | 3-4 weeks | 1-2 weeks | 2-3 weeks |
| System Integration | 4-6 weeks | 3-4 weeks | 2-3 weeks |
| Personnel Training | 2-3 weeks | 1-2 weeks | 1 week |
| Testing & Debugging | 3-4 weeks | 2 weeks | 1-2 weeks |
| Certification | 6-8 weeks | 2-3 weeks | Not possible |
| Total Timeline | 24-33 weeks | 9-13 weeks | 6-9 weeks |
| Required Specialists | 5-7 people | 2-3 people | 1-2 people |

Risks and Limitations: What Lies Behind the Numbers

Each model carries a set of risks that are not obvious at the selection stage. DeepSeek requires significant investment in infrastructure and expertise.

Table: Comparison of Key Risks and Limitations

| Risk/Limitation | DeepSeek (Local) | DeepSeek (API) | ChatGPT | Criticality |
| --- | --- | --- | --- | --- |
| Vendor Dependence | Low | Medium | Critical | High |
| Sanction Risks | None | Medium (15%/year) | High (40%/year) | Critical |
| Technical Support | Community/partners | Official | Unofficial | Medium |
| Documentation | Partial | Complete | Complete | Low |
| Model Updates | Manual | Automatic | Automatic | Medium |
| Peak Load Performance | Limited by GPU | Auto-scaling | Auto-scaling | High |
| Team Qualification | ML engineers | Middle developers | Junior developers | High |
| Data Leak Risk | Minimal | Medium | High | Critical |
| Recovery Time After Failure | 2-4 hours | 15 minutes | 1-2 hours | High |

Recommendations and Selection Strategy: Decision Matrix

Model selection should be based on three factors: data sensitivity, implementation budget, and strategic risks. Companies with turnover up to 1 billion rubles achieve ROI from local DeepSeek in 18-24 months.

Table: Model Selection Matrix by Company Profile

| Company Profile | Recommended Model | Annual TCO | ROI (months) | Key Risks | Strategic Priority |
| --- | --- | --- | --- | --- | --- |
| Government/Defense | DeepSeek Local | $95,000 | 8-10 | Team qualification | Security |
| Healthcare/Personal Data | DeepSeek Local | $88,000 | 12-15 | Infrastructure | Confidentiality |
| IT Product (Export) | ChatGPT Official | $57,800 | 14-16 | --- | Global standards |
| Education/R&D | DeepSeek API | $36,600 | 5-7 | Documentation | Accessibility |

Critical insights: For government corporations, the issue is not price but security clearance. Local DeepSeek is the only option. For export-oriented IT companies, ChatGPT is necessary for compliance with global coding standards, despite risks. ROI is calculated based on average savings of 3.2 FTE on automation tasks with average developer salary of 350,000 rubles.

Future Development and Roadmap: Bets for 2026

DeepSeek announced DeepSeek-V4 with 1.8 trillion parameters and 512 experts for Q4 2025. Focus on improving mathematical abilities and reducing latency to 0.8 seconds. ChatGPT-6 is expected in the second half of 2026 with 500,000 token context and native multimodal support. OpenAI plans to implement "personal expert modules" for corporate clients.

Table: Model and Technology Development Roadmap

| Indicator | DeepSeek 2025 | DeepSeek 2026 | ChatGPT 2025 | ChatGPT 2026 | Impact on Choice |
| --- | --- | --- | --- | --- | --- |
| Model Parameters | 671B → 1.8T | 1.8T + specialization | 1.75T | 3.0T (planned) | Scalability |
| Context Window | 128K → 256K | 256K + memory | 200K | 500K | Complex documents |
| Latency | 2.1s → 0.8s | 0.8s + optimization | 3.4s | 1.5s | Real-time tasks |
| Language Support | 40 → 60 | 60 + dialects | 50+ | 75+ | Globalization |
| Local Deployment | V4 supports | V4 optimized | No | No | Data sovereignty |
| Price per 1M tokens | -15% | -25% | +5% | +10% | TCO |
| Features | Coding + math | Visual logic | Multimodality | Agents | New scenarios |

Critical insights: DeepSeek-V4 with 1.8T parameters will require 8 H100 GPUs for local deployment, increasing capital expenditures by 40%. However, API price will decrease by 25%, making the cloud option TCO competitive with ChatGPT. OpenAI focuses on agent systems, which may create a technology gap in autonomous tasks.

Real Performance and Benchmarks: Production Numbers

Test benchmarks differ from production metrics. Real-world measurements show that DeepSeek V3.2-Exp processes 94% of requests faster than ChatGPT for coding, but 18% slower for creative tasks.

Table: Production Metrics from Real Implementations (January 2025)

| Performance Metric | DeepSeek V3.2-Exp | ChatGPT o3-mini | Difference | Measurement Conditions |
| --- | --- | --- | --- | --- |
| Average Latency (P50) | 1.8 sec | 2.1 sec | -14% | Coding, 100 tokens |
| P95 Latency | 3.2 sec | 4.8 sec | -33% | Peak load |
| P99 Latency | 8.4 sec | 12.1 sec | -31% | 1000+ requests/min |
| Request Success Rate | 99.7% | 97.2% | +2.5% | 30 days production |
| Recovery Time After Failure | 4.2 min | 1.8 min | +133% | Emergency scenario |
| Performance per 1 GPU | 89 tokens/sec | N/A | --- | A100 80GB |
| Performance per 8 GPUs | 684 tokens/sec | N/A | --- | A100 80GB |
| Scalability (Vertical) | Limited | Automatic | --- | Up to 10x |
| GPU VRAM Consumption | 72 GB | N/A | --- | Per model |
| Power Consumption (watts/request) | 0.47 W | 0.12 W | +292% | L40S GPU |

Key insights: in real production, ChatGPT shows better stability under low loads but degrades more during peaks. Local DeepSeek requires manual scaling but provides predictable performance. Local DeepSeek's power consumption is 4 times higher, a critical factor for large deployments.

Conclusion

2025 market analysis shows that the choice between DeepSeek and ChatGPT has become a strategic question of data control and cost optimization, not just a technological dilemma. Global companies implementing DeepSeek on their own infrastructure recoup investments of $84,200 in just 8-12 months, gaining full digital sovereignty and guaranteed compliance with strict GDPR and HIPAA standards. While DeepSeek API allows reducing operational costs by 35% through efficient caching, exclusive reliance on the OpenAI ecosystem creates critical business risks of vendor lock-in and inability to guarantee complete corporate information confidentiality.


Max Godymchyk


2026 Language Model: Moltbot – The Autonomous Personal AI Assistant That Actually Works!

February 04, 2026

Moltbot (formerly known as Clawdbot) has become one of the most talked-about technologies in the AI enthusiast world in early 2026. This open-source project promises not just to answer queries but to perform tasks for you—managing email, calendars, files, and applications.

But what is Moltbot really, is it worth running yourself, and what risks are associated with it? All this is covered in the detailed breakdown below.

What is Moltbot?

Moltbot is an open-source personal AI assistant that runs on your own computer or server and is capable of performing actions on behalf of the user, not just generating text. It operates 24/7, receives commands via messengers, and performs a variety of tasks: from managing messages to automating routine processes.

Moltbot is not just a chatbot; it's an action-oriented agent: it perceives messages, plans steps to achieve a goal, and activates relevant tools or functions on the user's device.

Project History and Its Creator

Behind Moltbot is an unusual developer—Peter Steinberger, a figure well-known in the Apple ecosystem. His journey is the story of a developer who first created a successful commercial product and then completely reoriented his vision of technology towards personal AI.

From PDF Libraries to Artificial Intelligence

Peter started his career in the early iPhone era, was actively involved in the Apple community CocoaHeads, and taught iOS development at Vienna Technical University. His main project for a long time was PSPDFKit—a powerful SDK for working with PDFs, sold not directly to users but to companies as a software component. It helped integrate PDF functionality into other products and applications.

In 2021, Peter sold his share in PSPDFKit—reportedly as part of a deal with the investment company Insight Partners. But, contrary to stereotypes about success, this deal became an emotional blow: Peter lost not just a project, but part of his identity. He candidly wrote in his blog about burnout, emptiness, loss of purpose, and unsuccessful attempts to reboot through parties, rest, or even therapy. Nothing helped. He was left without an idea he wanted to return to every morning.

AI as a Second Life

Everything changed in 2024-2025—when the boom of large language models reached a critical mass. Peter again felt the urge to create something new: now he was inspired by the idea of a personal AI that would live not in the cloud, but in your home, on your computer, with access to tasks, files, and habits.

Thus, Clawdbot was born—a home AI agent with a claw for a head and an emoji lobster as a mascot. It was conceived as a helper that actually does something useful, not just a talking head with an API. The name "Clawdbot" was a play on words: claw + Claude (the name of the beloved language model from Anthropic).

The project quickly gained popularity on microblogs, Reddit, and Hacker News: people began to massively share use cases, run the agent on Mac minis, and experiment with extending its capabilities.

Transition to Moltbot

In January 2026, Anthropic (creator of Claude) requested a change to the project's name to avoid confusion with their trademark. Peter took this calmly and renamed Clawdbot to Moltbot. The name became even more interesting in meaning: molt is "molting," the renewal process that real-life lobsters go through. Thus, Moltbot symbolized growth, renewal, evolution—of both the project and Peter himself.

Now the default chatbot is named Molty, and the entire project officially resides at: github.com/moltbot/moltbot.

The Personal Becomes Technical

From a technical perspective, Moltbot reflects Peter's mindset: he has always been a developer who thinks in terms of infrastructure, platforms, and building for growth. Instead of making just another chatbot, he created a structure that can be developed, adapted, and extended for any task. It's not just an assistant; it's an entire ecosystem into which anyone can integrate their own logic, skills, and workflow.

And now, as he admits in interviews, Moltbot is not just a project, but a new form of presence, a new form of life he found after an emotional crisis and leaving big business.

Moltbot's Technical Architecture: How It Works

At first glance, Moltbot might seem like just a "smart chatbot," but in reality, it's a full-fledged architectural platform consisting of several layers. Everything is built to be simultaneously flexible, extensible, and autonomous. Below is an explanation of the system's internal structure.

Core Concept

Moltbot is an AI agent that runs on a local machine, processes messages, performs actions, and interacts with external language models (Claude, OpenAI, Mistral, etc.).

At the same time, it:

  • maintains internal memory (in the form of text files),
  • connects to chats and applications via gateways,
  • can run OS commands, read and change files,
  • and all this—in continuous operation mode, as a service.

Core Components

1. Clawd (Agent Core)

This is the "brain" of the system—the agent that lives on your machine (Mac, Linux, Raspberry Pi, or WSL), monitors conversations, context, commands, and tasks, organizes "memory," and launches "skills," communicates with the model via API, and crafts prompts. It's written in TypeScript and runs on Node.js (or Bun).

2. Gateway (External Communication)

This is the "gateway" that receives incoming messages from messengers and forwards them to the agent. It:

  • provides a management web interface (Control UI),
  • exposes an API for messages and WebSocket connections,
  • can work with bots in Telegram, WhatsApp, Discord, etc.,
  • can proxy connections (e.g., through a reverse proxy).

💡 By default, the gateway listens on 127.0.0.1:18789. For remote access, you need to change gateway.bind to 0.0.0.0 and secure the connection (VPN, password, authorization).
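A remote-access setup might then look like the sketch below. Only the gateway.bind and gateway.auth.token keys are taken from this article; the surrounding moltbot.json structure and the port key are illustrative assumptions, not the project's documented schema:

```json
{
  "gateway": {
    "bind": "0.0.0.0",
    "port": 18789,
    "auth": { "token": "replace-with-a-long-random-token" }
  }
}
```

Pair a configuration like this with a VPN or SSH tunnel rather than exposing the port directly to the internet.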

3. Control UI (Local Interface)

A simple web interface based on Vite and Lit. Through it you can:

  • manage Moltbot's configuration,
  • view conversation logs,
  • control active channels and skills,
  • and even manually issue commands.

4. Skills

Each skill is an extension of the agent's functionality. It consists of a description (in Markdown or JSON format), code (in JavaScript, TypeScript, or Shell), arguments, and launch conditions.

Examples of skills:

  • Spotify control,
  • sending email,
  • working with Google Docs or Notion,
  • generating images via Stable Diffusion,
  • screenshots, audio transcription, script execution.

You can write skills yourself or download them from ClawdHub / MoltHub.

Memory Structure

Moltbot's memory is simple yet powerful. It is implemented as regular text files:

  • memory/notes/YYYY-MM-DD.md – temporary notes,
  • memory/facts.md – stable information about the user (name, habits, contexts),
  • memory/history/ – log of communication and decisions made.

This allows for manual memory editing, control over what the bot "remembers," and copying or transferring data between devices.

Working with the Language Model

Moltbot does not contain its own model but connects to external APIs:

  • Anthropic Claude (recommended: Claude 3 or 4.5 Opus),
  • OpenAI GPT‑4 / GPT‑3.5,
  • Mistral, Gemini, Perplexity – via OpenRouter or other proxies.

All requests to the model go through Clawd and are accompanied by system prompts, memory and notes, situation descriptions, and user preferences.

Results from the model can immediately trigger commands, skills, or provide answers.
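A minimal sketch of that assembly step: combine the system prompt, stored memory, and the incoming message into a chat-style payload. The function and field names here are illustrative assumptions; the actual prompt format Clawd uses is not documented in this article:

```python
# Hypothetical sketch of assembling an agent request for an external
# model API, per the description above: system prompt + memory + message.

def build_messages(system_prompt: str, facts: str, notes: str, user_msg: str) -> list[dict]:
    """Combine persistent memory with the incoming message into a chat payload."""
    context = f"Known facts about the user:\n{facts}\n\nRecent notes:\n{notes}"
    return [
        {"role": "system", "content": f"{system_prompt}\n\n{context}"},
        {"role": "user", "content": user_msg},
    ]

msgs = build_messages(
    system_prompt="You are Molty, a personal assistant that can run skills.",
    facts="Name: Alex. Timezone: CET.",
    notes="2026-02-04: asked for a daily audio report.",
    user_msg="What's on my calendar tomorrow?",
)
print(msgs[0]["role"], msgs[1]["role"])  # system user
```

The model's reply would then be parsed for answers or tool invocations, which is how results can trigger commands and skills.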

Installation and Configuration

During installation, Moltbot:

  • creates the ~/.moltbot/ directory,
  • saves the configuration file moltbot.json,
  • generates directories for skills, memory, and logs,
  • installs a system daemon (systemd or launchctl on Mac),
  • can automatically start the gateway and UI.

Security

This is a critically important component:

  1. By default, Moltbot is only accessible from the local machine.
  2. UI authorization is via token (gateway.auth.token).
  3. It is not recommended to expose the port directly to the internet.
  4. All API keys and tokens should be stored in secure environment variables.

Additionally, it is recommended to run it in an isolated system (e.g., a separate Mac mini), use VPN or SSH tunnels for external access, and periodically update and check the gateway configuration.

Architectural Features

  • Cross-platform: Works on Mac, Linux, Windows (via WSL), Raspberry Pi.
  • Modularity: You can change the core, model, channels, and skills independently.
  • Fault tolerance: Support for fallback models (in case the main provider is unavailable).
  • Fully transparent structure: Everything is stored in open files—no black boxes.

Capabilities and Integrations

Moltbot supports connections to numerous services and applications via "skills":

  • Managing messages via Telegram, WhatsApp, Discord, Slack, Signal, iMessage, and others.
  • Executing terminal commands and interacting with the local file system.
  • Integrations with calendars, email, reminders, Telegram bots, and more complex task flows.
  • Creating custom skills that can be exported to MoltHub—the community shares ready-made extensions.

Moltbot's key feature is that it is not limited to just answering but can perform actions at the system level.

Why Running on a Dedicated Device is Common Practice

Moltbot must run continuously—saving state, listening for events, and processing commands quickly. Running it on a laptop that frequently sleeps, disconnects from the network, or switches between networks disrupts its operation. Therefore, many enthusiasts prefer to set up a dedicated computer: often a Mac mini, but other devices (even a Raspberry Pi) will work.

The Mac mini became a popular choice due to its compactness, low power consumption, and integration with iMessage and other Apple services, which are harder to use on Linux.

Security Concerns – What You Need to Know

Moltbot's extended permissions are not only powerful but also a risk. Why?

Admin-level access to the system can lead to compromise if interfaces are exposed externally or misconfigured. An unprotected Control UI can expose API keys, messenger tokens, and other secrets. Prompt-injection attacks are also possible, where malicious input forces Moltbot to perform unintended actions.

Due to its popularity, the project has already become a target for fake tokens and fraudulent schemes related to old names and meme coins. Therefore, developers and experts strongly recommend running Moltbot in an isolated environment, carefully configuring authorization, and avoiding exposing ports to the internet.

Practical Use Case Examples

Moltbot is capable of performing real tasks, but most stories are still experimental:

  • Automatic checking of email, calendars, and reminders.
  • Sending daily audio reports on user tasks and activity.
  • Managing notifications and integrating with cloud services.

However, stories about Moltbot buying a car by itself or fully organizing complex processes without user involvement remain rare and still require step-by-step human guidance.

In conclusion, Moltbot is one of the most impressive experiments with autonomous AI agents to date. It demonstrates how large language models can move from chat to action: performing tasks and integrating with messengers and system tools.

But along with this, it requires technical expertise and careful security configuration, carries increased risk if deployed incorrectly, and for now remains a product for enthusiasts, not mainstream users.

If you want to try Moltbot—do so cautiously, on dedicated hardware, considering all risks. And for those seeking stability and security, it might be better to wait until the architecture of such agents matures further.


Max Godymchyk
