In the olden days, most people rolled their own servers to expose their ML models to client applications. In fact, Roboflow Inference's HTTP interface and REST API are built on FastAPI.
In this day and age, it's certainly still possible to start from scratch, but you'll be reinventing the wheel and running into footguns that others have already engineered around. It's usually better and faster to use one of the existing ML-focused serving frameworks.
Choose FastAPI or Flask if: your main goal is learning the intricacies of making an inference server.
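To make the "rolling your own" trade-off concrete, here is a minimal sketch of a hand-built inference server. It uses only the Python standard library (rather than FastAPI, to keep it dependency-free), and the `predict` function is a hypothetical stub standing in for a real model; a production server would also need batching, input validation, model loading, and error handling, which is exactly the wheel-reinventing the paragraph above warns about.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen


def predict(payload):
    # Hypothetical stand-in for a real model's forward pass.
    return {"predictions": [{"class": "example", "confidence": 0.9}]}


class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Expose a single JSON-in, JSON-out inference endpoint.
        if self.path != "/infer":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(predict(payload)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence per-request console logging for the demo.
        pass


def run_demo():
    # Bind to an ephemeral port, serve in a background thread,
    # send one request, and return the parsed response.
    server = HTTPServer(("127.0.0.1", 0), InferenceHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    req = Request(
        f"http://127.0.0.1:{port}/infer",
        data=json.dumps({"image": "..."}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        result = json.loads(resp.read())
    server.shutdown()
    return result


if __name__ == "__main__":
    print(run_demo())
```

Even this toy version has to get content-length handling, routing, and response headers right by hand; frameworks like FastAPI (which Roboflow Inference builds on) handle all of that and more out of the box.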
Inference turns any computer or edge device into a command center for your computer vision projects.