$SHIMMY
v1.0.0
Model Fleet · 6 models

Model             Status    Format        Params   VRAM       Throughput
llama-3.1-70b     ACTIVE    GGUF          70B      38.2 GB    42 tok/s
mistral-7b-v0.3   READY     GGUF          7B       4.1 GB     128 tok/s
qwen2.5-32b       READY     SafeTensors   32B      18.6 GB    67 tok/s
phi-3-medium      STANDBY   GGUF          14B      7.8 GB     95 tok/s
deepseek-r1-8b    STANDBY   SafeTensors   8B       4.8 GB     112 tok/s
command-r-35b     STANDBY   GGUF          35B      19.4 GB    58 tok/s
SETUP
Mac Mini · M-series · Local inference
4 steps → autonomous
1. Install runtime
   $ curl -fsSL https://shimmy.sh/install | sh
   Single binary, no Python, no Docker
2. Pull a model
   $ shimmy pull llama-3.1-70b --quant q4
   GGUF / SafeTensors, auto-quantize to fit VRAM
3. Start serving
   $ shimmy serve --port 8080 --api openai
   OpenAI-compatible API on your local network
4. Deploy agent
   $ shimmy agent start --strategy sniper
   Autonomous trading via on-chain skills
AGENT SKILLS · 6 available

▸ token-scanner: Real-time Solana token discovery via DexScreener
▸ liquidity-probe: Deep liquidity & holder distribution analysis
▸ sniper-entry: Sub-second buy execution on new pairs
▸ exit-engine: Auto-extract on first pump, configurable targets
▸ rug-detector: Contract analysis, mint authority, LP lock checks
▸ wallet-cluster: Cross-reference wallet patterns for insider detection
shimmy v1.0.0 · Apple Silicon optimized
zero cloud fees · your hardware · your edge
shimmy v1.0.0 · rust · apache-2.0