
<a href="#"> <img src="https://github.com/user-attachments/assets/27778034-29f5-4a71-b696-4e3f70760b26" > </a> </p>

What is GCA?

GCA is an open-source framework for building vertical AI agents. It supports many LLMs and newer technologies such as MCP. You can build your own army of vertical AI agents in a few commands with the structured API.

<p> <p > <a href="https://www.producthunt.com/posts/gpt-computer-assistant?embed=true&utm_source=badge-top-post-badge&utm_medium=badge&utm_souce=badge-gpt&#0045;computer&#0045;assistant" target="_blank"><img src="https://api.producthunt.com/widgets/embed-image/v1/top-post-badge.svg?post_id=465468&theme=dark&period=daily" alt="GPT&#0032;Computer&#0032;Assistant - Create&#0032;intelligence&#0032;for&#0032;your&#0032;products | Product Hunt" width="200" /></a> . <a href="https://discord.gg/qApFmWMt8x"><img alt="Static Badge" src="https://img.shields.io/badge/Discord-Join?style=social&logo=discord" width=120></a> . <a href="https://x.com/GPTCompAsst"><img alt="Static Badge" src="https://img.shields.io/badge/X_App-Join?style=social&logo=x" width=100></a> </p> <p> <a href="https://www.python.org/"> <img src="https://img.shields.io/badge/Made%20with-Python-1f425f.svg" alt="Made_with_python"> </a> . <img src="https://static.pepy.tech/personalized-badge/gpt-computer-assistant?period=total&units=international_system&left_color=grey&right_color=blue&left_text=PyPI%20Downloads" alt="pypi_downloads"> </p> <p align="center"> <br> <br> </p>

Playground of GCA | NEW

At playground.gca.dev you can test and develop your own strategies for building a Vertical AI Agent.

<a href="https://playground.gca.dev/"> <img src="https://github.com/user-attachments/assets/125a1a15-0fee-4c7e-bfc5-1a23ef83c92d" alt="Playground" width=1000> </a> <p align="center"> <br> <br> </p>

GPT Computer Assistant (GCA)

GCA is an AI agent framework designed for computer use across Windows, macOS, and Ubuntu. GCA lets you hand repetitive, small-logic tasks over to an AI. We believe this has significant potential. Whether you’re a developer, analyst, or IT professional, GCA can help you accomplish more in less time.

Imagine this:

These examples show how GCA realizes the concept of <b>Vertical AI Agents</b>: solutions that not only replicate human tasks but, in some cases, also exceed human speed.

<p align="center"> <br> <br> </p>

How Does GCA Work?

GCA is a Python-based project that runs on multiple operating systems, including Windows, macOS, and Ubuntu. It integrates external concepts, like the Model Context Protocol (MCP), along with its own modules, to interact with and control a computer efficiently. The system performs both routine and advanced tasks by mimicking human-like actions and applying computational precision.

1. Human-like Actions:

GCA can replicate common user actions, such as:

2. Advanced Capabilities:

Through MCP and GCA’s own modules, it achieves tasks that go beyond standard human interaction, such as:

<p align="center"> <br> <br> <br> </p>

Prerequisites

<p align="center"> <br> <br> </p>

Using GCA.dev Cloud

<b>Installation</b>

pip install gpt-computer-assistant

Single Instance:

from gpt_computer_assistant import cloud

# Starting instance
instance = cloud.instance()

# Show Screenshot
instance.current_screenshot()

# Asking and getting result
result = instance.request("Extract the tech stacks of gpt-computer-assistant Company", "i want a list")
print(result)

# Closing the instance
instance.close()
<img src="https://github.com/user-attachments/assets/3fd70530-6b86-43b4-9025-dce7853e4a38" alt="Cloud" width=1000> <p align="center"> <br> <br> <br> </p>
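
If `request` raises an exception, the final `instance.close()` in the example above never runs. Here is a minimal sketch of one way to guarantee cleanup. It assumes only that the instance exposes the `request` and `close` methods shown above; `run_request` is a hypothetical helper name, not part of the GCA API.

```python
from contextlib import closing


def run_request(instance, task, expected_output):
    """Send one request and always close the instance afterwards.

    `instance` is any object with the `request(task, expected_output)` and
    `close()` methods used in the example above. This helper is a
    hypothetical convenience wrapper, not part of the GCA API.
    """
    # contextlib.closing calls instance.close() on exit, even on error.
    with closing(instance) as active:
        return active.request(task, expected_output)


# Usage (requires the package and access to GCA.dev Cloud):
#   from gpt_computer_assistant import cloud
#   print(run_request(cloud.instance(), "Extract the tech stacks of gpt-computer-assistant Company", "i want a list"))
```

The same wrapper should work for the Docker and local instances described later, since they are used through the same two methods.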

Self-Hosted GCA Server

Docker

Pulling Image

docker pull upsonic/gca_docker_ubuntu:dev0-AMD64

Starting container

docker run -d -p 5901:5901 -p 7541:7541 upsonic/gca_docker_ubuntu:dev0-AMD64
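
The container may need a moment before it accepts connections. The sketch below polls the published port from Python before you connect. `wait_for_port` is a hypothetical helper, not part of GCA, and a successful TCP connection only shows that something is listening, not that the server has finished starting.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, else False."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Usage: 7541 is the port published by the `docker run` command above.
#   if not wait_for_port("localhost", 7541):
#       raise SystemExit("GCA container did not open port 7541 in time")
```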

LLM Settings and Usage

from gpt_computer_assistant import docker

# Starting instance
instance = docker.instance("http://localhost:7541/")

# Connecting to OpenAI and Anthropic
instance.client.save_model("gpt-4o")
instance.client.save_openai_api_key("sk-**")
instance.client.save_anthropic_api_key("sk-**")

# Asking and getting result
result = instance.request("Extract the tech stacks of gpt-computer-assistant Company", "i want a list")
print(result)

instance.close()
<p align="center"> <br> <br> </p>
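
Hard-coding API keys in a script makes them easy to leak. As a sketch, the settings calls above can be driven from environment variables instead. `configure_client` is a hypothetical helper, and the variable names `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` are an assumed convention, not something GCA requires.

```python
import os


def configure_client(client, model="gpt-4o", env=None):
    """Apply the model and API keys from environment variables to a client.

    Uses only the `save_model`, `save_openai_api_key`, and
    `save_anthropic_api_key` calls shown above. The variable names
    OPENAI_API_KEY and ANTHROPIC_API_KEY are an assumed convention.
    Returns the list of providers whose keys were found and saved.
    """
    env = os.environ if env is None else env
    client.save_model(model)
    configured = []
    openai_key = env.get("OPENAI_API_KEY")
    if openai_key:
        client.save_openai_api_key(openai_key)
        configured.append("openai")
    anthropic_key = env.get("ANTHROPIC_API_KEY")
    if anthropic_key:
        client.save_anthropic_api_key(anthropic_key)
        configured.append("anthropic")
    return configured


# Usage: call configure_client(instance.client) right after creating the instance.
```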

Local

<b>Installation</b>

pip install 'gpt-computer-assistant[base]'
pip install 'gpt-computer-assistant[api]'
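
To confirm that the install landed in the interpreter you are actually running, a quick check along these lines can help. It is a sketch using only the standard library; `is_installed` is a hypothetical helper name.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


# `gpt_computer_assistant` is the import name used throughout this README.
if not is_installed("gpt_computer_assistant"):
    print("gpt-computer-assistant is not installed; run the pip commands above.")
```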

<b>LLM Settings and Usage</b>

from gpt_computer_assistant import local

# Starting instance
instance = local.instance()

# Connecting to OpenAI and Anthropic
instance.client.save_model("gpt-4o")
instance.client.save_openai_api_key("sk-**")
instance.client.save_anthropic_api_key("sk-**")

# Asking and getting result
result = instance.request("Extract the tech stacks of gpt-computer-assistant Company", "i want a list")
print(result)

instance.close()
<img width="1000" src="https://github.com/user-attachments/assets/327cdceb-49e7-4a8a-a724-e386553f43d8"> <p align="center"> <br> <br> <br> </p>
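
If you want to send several requests in one session, a small loop like the following may be enough. This is a sketch under the assumption that one instance can serve multiple `request` calls before `close`, which this README does not state explicitly; `run_batch` is a hypothetical helper, not part of GCA.

```python
def run_batch(instance, jobs):
    """Run (task, expected_output) pairs on one instance and collect results.

    Returns a list of (task, result, error) tuples, where `error` is None
    when the request succeeded. The instance is always closed at the end.
    """
    outcomes = []
    try:
        for task, expected_output in jobs:
            try:
                result = instance.request(task, expected_output)
                outcomes.append((task, result, None))
            except Exception as exc:
                # Record the failure and continue with the remaining jobs.
                outcomes.append((task, None, exc))
    finally:
        instance.close()
    return outcomes


# Usage:
#   jobs = [("Extract the tech stacks of gpt-computer-assistant Company", "i want a list")]
#   for task, result, error in run_batch(local.instance(), jobs):
#       print(task, "->", error or result)
```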

Adding Custom MCP Server to GCA

instance.client.add_mcp_server("websearch", "npx", ["-y", "@mzxrai/mcp-webresearch"])
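
When you use more than one MCP server, keeping them in a single mapping can make the setup easier to review. The sketch below assumes `add_mcp_server` takes the name, command, and argument list positionally, as in the call above. `register_mcp_servers` and `MCP_SERVERS` are hypothetical names, not part of GCA.

```python
def register_mcp_servers(client, servers):
    """Register each MCP server in `servers` on a GCA client.

    `servers` maps a server name to a (command, args) pair, mirroring the
    three positional arguments of `add_mcp_server` shown above.
    Returns the names in registration order.
    """
    registered = []
    for name, (command, args) in servers.items():
        client.add_mcp_server(name, command, list(args))
        registered.append(name)
    return registered


# The only entry here is the server from the example above.
MCP_SERVERS = {
    "websearch": ("npx", ["-y", "@mzxrai/mcp-webresearch"]),
}

# Usage: register_mcp_servers(instance.client, MCP_SERVERS)
```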

Roadmap

| Feature | Status | Target Release |
| --- | --- | --- |
| Clear Chat History | Completed | Q2 2024 |
| Long Audio Support (Split 20 MB) | Completed | Q2 2024 |
| Text Inputs | Completed | Q2 2024 |
| Just Text Mode (Mute Speech) | Completed | Q2 2024 |
| Added Profiles (Different Chats) | Completed | Q2 2024 |
| More Feedback About Assistant Status | Completed | Q2 2024 |
| Local Model Vision and Text (With Ollama, and vision models) | Completed | Q2 2024 |
| Our Customizable Agent Infrastructure | Completed | Q2 2024 |
| Supporting Groq Models | Completed | Q2 2024 |
| Adding Custom Tools | Completed | Q2 2024 |
| Click on something on the screen (text and icon) | Completed | Q2 2024 |
| New UI | Completed | Q2 2024 |
| Native Applications, exe, dmg | Completed | Q3 2024 |
| Collaborative Speaking with Different Voice Models on Long Responses | Completed | Q2 2024 |
| Auto Stop Recording When You Finish Talking | Completed | Q2 2024 |
| Wakeup Word | Completed | Q2 2024 |
| Continuous Conversations | Completed | Q2 2024 |
| Adding More Capabilities on Device | Completed | Q2 2024 |
| Local TTS | Completed | Q3 2024 |
| Local STT | Completed | Q3 2024 |
| Tray Menu | Completed | Q3 2024 |
| New Line (Shift + Enter) | Completed | Q4 2024 |
| Copy Pasting Text Compatibility | Completed | Q4 2024 |
| Global Hotkey | On the way | Q3 2024 |
| DeepFace Integration (Facial Recognition) | Planned | Q3 2024 |

Capabilities

At this time we have many infrastructure elements in place. We aim to provide everything that is already available in the ChatGPT app.

| Capability | Status |
| --- | --- |
| Local LLM with Vision (Ollama) | OK |
| Local text-to-speech | OK |
| Local speech-to-text | OK |
| Screen Read | OK |
| Click on Text or an Icon on the Screen | OK |
| Move to Text or an Icon on the Screen | OK |
| Typing Something | OK |
| Pressing Any Key | OK |
| Scrolling | OK |
| Microphone | OK |
| System Audio | OK |
| Memory | OK |
| Open and Close App | OK |
| Open a URL | OK |
| Clipboard | OK |
| Search Engines | OK |
| Writing and running Python | OK |
| Writing and running SH | OK |
| Using your Telegram Account | OK |
| Knowledge Management | OK |
| Add more tools | ? |

Predefined Agents

If you enable this feature, your assistant will work with these teams:

| Team Name | Status |
| --- | --- |
| search_on_internet_and_report_team | OK |
| generate_code_with_aim_team_ | OK |
| Add your own one | ? |

<a href="#"> <img src="https://github.com/onuratakan/gpt-computer-assistant/assets/41792982/ba590bf8-6059-4cb6-8c4e-6d105ce4edd2" alt="Logo" > </a>

Contributors

<a href="https://github.com/upsonic/gpt-computer-assistant/graphs/contributors"> <img src="https://contrib.rocks/image?repo=upsonic/gpt-computer-assistant" /> </a>