RetinaNet Keras uses its own comma-separated annotation format, in which all the annotations for a dataset live in a single file. Each line describes one bounding box: the image file name, four numeric box values, and a class label.
Writing a script to convert your data just to try one specific model can be time-consuming and error-prone. Why waste the time when you can use Roboflow to convert your data for you?
0001.jpg,2694,1211,353,353,helmet
0001.jpg,1754,1449,68,68,person
0002.jpg,113,95,226,226,helmet
0003.jpg,352,114,151,151,helmet
0003.jpg,799,217,139,139,person
0004.jpg,162,126,124,124,helmet
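To make the one-box-per-line layout concrete, here is a minimal Python sketch that reads annotations in this shape and groups them by image. The `read_annotations` helper and the `SAMPLE` constant are hypothetical names introduced for illustration; they are not part of RetinaNet Keras or Roboflow. The four numeric fields are kept in file order and deliberately left uninterpreted, so check the RetinaNet Keras documentation for the exact coordinate convention before relying on them.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the example above, one bounding box per line.
SAMPLE = """\
0001.jpg,2694,1211,353,353,helmet
0001.jpg,1754,1449,68,68,person
0002.jpg,113,95,226,226,helmet
0003.jpg,352,114,151,151,helmet
0003.jpg,799,217,139,139,person
0004.jpg,162,126,124,124,helmet
"""


def read_annotations(handle):
    """Group annotation rows by image file name.

    Hypothetical helper: each row is assumed to hold exactly six fields,
    namely image name, four integers, and a class label.
    """
    boxes_by_image = defaultdict(list)
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        image, a, b, c, d, label = row
        # The numeric values are stored in file order, without interpretation.
        boxes_by_image[image].append((int(a), int(b), int(c), int(d), label))
    return dict(boxes_by_image)


annotations = read_annotations(io.StringIO(SAMPLE))
for image, boxes in annotations.items():
    print(image, len(boxes))
```

Because every box is its own row, an image with several objects simply appears on several lines, which is why the sketch groups rows by file name rather than assuming one row per image.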