Two transport choices
When an app calls a model service, it needs a protocol. REST typically sends JSON over HTTP, which makes it human-readable and nearly universal. gRPC sends compact binary messages (Protocol Buffers by default) over HTTP/2 and is built for fast machine-to-machine calls.
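To make the size difference concrete, here is a minimal sketch that encodes the same prediction request as JSON and as a fixed binary layout. The field names (`model_id`, `features`) and the layout are assumptions for illustration, and Python's `struct` module is only a stand-in for a binary format: it is not the Protocol Buffers wire format that gRPC uses.

```python
import json
import struct

# Hypothetical prediction request: a model id and a small feature vector.
FEATURES = [0.12, 3.4, 5.6, 7.8]


def encode_json(model_id, features):
    """Text encoding, as a REST/JSON endpoint would receive it."""
    body = {"model_id": model_id, "features": features}
    return json.dumps(body).encode("utf-8")


def encode_binary(model_id, features):
    """Illustrative binary layout (NOT protobuf):
    little-endian uint32 model id, uint32 count, then float32 values."""
    return struct.pack(f"<II{len(features)}f", model_id, len(features), *features)


def decode_binary(payload):
    """Reverse of encode_binary: read the header, then the float32 values."""
    model_id, count = struct.unpack_from("<II", payload, 0)
    values = struct.unpack_from(f"<{count}f", payload, 8)
    return model_id, list(values)


json_bytes = encode_json(7, FEATURES)
binary_bytes = encode_binary(7, FEATURES)
print("JSON bytes:  ", len(json_bytes))
print("binary bytes:", len(binary_bytes))
```

The binary form carries no field names or punctuation, which is where most of the saving comes from. The cost is that the bytes are unreadable without the schema, which is the debugging tradeoff the rest of this section describes.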
Where REST shines
- Easy to debug with a browser or simple tools.
- Works everywhere, including web pages and scripts.
- Good for low-volume or external-facing endpoints.
Where gRPC shines
- Binary encoding usually makes payloads smaller and parsing faster.
- Strongly typed contracts catch mistakes early.
- Streaming lets the server push tokens or partial results as they form.
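The streaming point can be sketched without any network code. The example below uses plain Python generators, not the gRPC library, and `fake_model` is a hypothetical stand-in that simply emits the words of the prompt as "tokens". The shape is similar to how server-streaming handlers are commonly written in Python gRPC services, where the handler yields response messages one at a time.

```python
from typing import Iterator, List


def fake_model(prompt: str) -> Iterator[str]:
    """Hypothetical model: produces one token at a time."""
    for word in prompt.split():
        yield word


def unary_call(prompt: str) -> List[str]:
    """Request/response style: the caller sees nothing until
    every token has been generated."""
    return list(fake_model(prompt))


def streaming_call(prompt: str) -> Iterator[str]:
    """Streaming style: each token is handed to the caller
    as soon as it is produced."""
    for token in fake_model(prompt):
        yield token


# The caller can act on the first token before the rest exist.
stream = streaming_call("partial results arrive early")
first = next(stream)
print("first token:", first)
print("remaining:  ", list(stream))
```

With the unary version, time to first token equals time to last token. With the streaming version, the client can start rendering or processing immediately, which is why streaming matters for token-by-token model output.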
The tradeoff
REST is friendlier; gRPC is faster and leaner. For a public API, REST's simplicity usually wins. For high-volume internal traffic between services, gRPC can cut latency and bandwidth, which matters when you serve thousands of predictions per second.
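A back-of-the-envelope calculation shows why payload size matters at this volume. The request rate and payload sizes below are made-up numbers chosen for illustration, not measurements of any real service.

```python
def bandwidth_mbit_per_s(requests_per_second: int, payload_bytes: int) -> float:
    """Sustained bandwidth in megabits per second for a given
    request rate and per-request payload size."""
    return requests_per_second * payload_bytes * 8 / 1_000_000


# Hypothetical workload: 5,000 predictions per second.
RATE = 5_000
JSON_PAYLOAD = 400    # assumed bytes per JSON request
BINARY_PAYLOAD = 150  # assumed bytes per binary request

print("JSON:  ", bandwidth_mbit_per_s(RATE, JSON_PAYLOAD), "Mbit/s")    # 16.0
print("binary:", bandwidth_mbit_per_s(RATE, BINARY_PAYLOAD), "Mbit/s")  # 6.0
```

Under these assumed sizes the binary encoding needs well under half the bandwidth. The real ratio depends on the message shape, so it is worth measuring with your own payloads before deciding.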
Key idea
REST trades speed for reach and readability; gRPC trades convenience for compact, fast, typed calls. Pick REST at the edge and gRPC for heavy internal inference traffic.