Questions regarding data format #7

@HANJU0427

Hello, we have been considering using your Nav-COT-38K dataset, but I am not clear about its annotation format. As I understand it, each record in the jsonl file represents an analysis of the agent's current state and the proposed next action at that location. That is, each record corresponds to a decision at a single time step, unlike the R2R dataset, where each record corresponds to an entire trajectory. Each record provides 6 images, but when I checked them, I could not work out how these images relate to one another. I give an example below.
For trajectory ID 7, this is the original R2R annotation:

"distance": 15.79, "scan": "VzqfbhrpDEA", "path_id": 7, "path": [ "c7096f5da9c94c709477ff0d88635e14", "f74898f691214cd784c6af1ce5b53f6a", "0f5d0c2dea9b4929a9c3c37ef159b360", "4faab9d5f15a40fa8620efc0ad9e9db2", "b867ba11a0e84506aaeb257e1ef85388", "8b7469d12514415cada466823f2a7063", "9177fe5f923242f0b1ba16c7a058e5ac" ], "heading": 4.925, "instructions": [ "With the sink on your left go around the counter and into the main room taking a left in the hallway and walking down the hall going right and stopping at the top of the stairs in the hallway.", "Exit the bar area. Turn left and walk down the hall, then turn right. Walk slightly to the left, toward the stairs. Wait by the stairs.", "Walkdown the hallway to the right of the billiards table. Stop at the top of the staircase." ],

This trajectory corresponds to 7 panoramic images from the MP3D data:

Image

These pictures show positions that are clearly connected to one another.

The following is the record for the same trajectory ID (7), extracted from your jsonl file.

{"id": "r2r_train_VzqfbhrpDEA_ep10_t001", "source": "R2R-CE", "split": "train", "episode_id": 10, "trajectory_id": 7, "scene_id": "VzqfbhrpDEA", "timestep": 1, "total_steps": 5, "instruction": "With the sink on your left go around the counter and into the main room taking a left in the hallway and walking down the hall going right and stopping at the top of the stairs in the hallway. ", "current_position": [-15.667499542236328, 3.491008996963501, -2.1663599014282227], "current_heading_rad": -2.3044757572012675, "next_position": [-17.282800674438477, 3.491008996963501, -3.622659921646118], "action_label": "turn left 13° | move forward 217cm", "geodesic_distance": 17.636335372924805, "metadata": {"goal_position": [-24.513900756835938, 3.491008996963501, -8.207169532775879], "start_position": [-11.777000427246094, 3.491008996963501, -0.05791720002889633]}, "images": ["F:\MP3D\v1\scans\VzqfbhrpDEA\VzqfbhrpDEA\matterport_skybox_images\ca9d8289f61244288b696bf832fa2713_skybox3_sami.jpg", "F:\MP3D\v1\scans\VzqfbhrpDEA\VzqfbhrpDEA\matterport_skybox_images\24360576a503468b92c6bbb1451e832f_skybox3_sami.jpg", "F:\MP3D\v1\scans\VzqfbhrpDEA\VzqfbhrpDEA\matterport_skybox_images\611e887f347a4bd4afbf09666ef657ac_skybox2_sami.jpg", "F:\MP3D\v1\scans\VzqfbhrpDEA\VzqfbhrpDEA\matterport_skybox_images\d70f2efc18f34b57a642332228e899d7_skybox2_sami.jpg", "F:\MP3D\v1\scans\VzqfbhrpDEA\VzqfbhrpDEA\matterport_skybox_images\0df37f2d82bd48fb963df1839e33fd32_skybox5_sami.jpg", "F:\MP3D\v1\scans\VzqfbhrpDEA\VzqfbhrpDEA\matterport_skybox_images\2de328d1e8954da6a4f8027a2d6092f0_skybox4_sami.jpg"], "image_clip_scores": [0.304443359375, 0.298828125, 0.2978515625, 0.29541015625, 0.295166015625, 0.294677734375], "num_candidate_images": 1854, "model": "gemini-2.5-flash", "prompt_version": "cot_v1.0", "response_raw": "think\nScene Analysis: The initial images show a bathroom or utility room with a sink/counter on the right (Image 4) and a more elaborate bathroom vanity with a sink in Image 2. 
Image 3 shows a laundry basket on a counter with a sink to the right. Images 1 and 6 show a hallway with multiple dark wood doors. Image 5 shows the floor and the bottom of a door. The starting point seems to be near the counter/sink area. The instruction mentions "the sink on your left". Images 3 and 4 show a sink on the right. Image 2 shows a sink. Based on the instruction, the robot is likely in a room and needs to navigate out. The hallway in images 1 and 6 seems to be the main room/hallway.\n\nInstruction Goal: Go around the counter (where the sink is on the left), enter the main room, take a left in the hallway, walk down the hall, turn right, and stop at the top of the stairs in the hallway.\n\nNavigation Plan: Currently, the sink is on the robot's right (from Image 4). The instruction says "with the sink on your left". This means the robot needs to turn around first. From Image 4, turning around 180 degrees would place the sink on the left. Then, the robot needs to go around the counter. Images 3 and 4 show a doorway. This doorway leads to the hallway visible in Image 1 and 6. I will assume the robot is positioned such that the counter is to its right, and it needs to turn 180 degrees to have the counter and sink on its left. Then, it needs to move forward to go around the counter and into the main room. I will estimate turning 180 degrees and moving forward to clear the counter and enter the hallway.\n\nScene Analysis: Images 3 and 4 show a room with a sink and counter. Image 4 shows the sink on the right. Image 3 shows a laundry basket on a counter, implying a utility or laundry room. Images 1 and 6 show a hallway with dark wood doors and light-colored walls, leading to other rooms. Image 2 shows a large bathroom vanity with a sink. 
The instruction states "with the sink on your left", which means the robot is currently facing away from the sink shown in Image 4.\n\nInstruction Goal: The robot needs to turn to have the sink on its left, go around the counter, enter the main room, turn left in the hallway, walk down the hall, turn right, and stop at the top of the stairs.\n\nNavigation Plan: The current visual context (Image 4) shows a sink on the right. To have the sink on the left, the robot must turn 180 degrees. Then, it needs to move forward to go around the counter and exit this room into the hallway (main room). The doorway in Images 3 and 4 leads to the hallway seen in Images 1 and 6. I will turn 180 degrees and move forward into the hallway.\n\n\nturn right 180° | move forward 150cm\n", "think": "Scene Analysis: Images 3 and 4 show a room with a sink and counter. Image 4 shows the sink on the right. Image 3 shows a laundry basket on a counter, implying a utility or laundry room. Images 1 and 6 show a hallway with dark wood doors and light-colored walls, leading to other rooms. Image 2 shows a large bathroom vanity with a sink. The instruction states "with the sink on your left", which means the robot is currently facing away from the sink shown in Image 4.\n\n
Instruction Goal: The robot needs to turn to have the sink on its left, go around the counter, enter the main room, turn left in the hallway, walk down the hall, turn right, and stop at the top of the stairs.\n\n
Navigation Plan: The current visual context (Image 4) shows a sink on the right. To have the sink on the left, the robot must turn 180 degrees. Then, it needs to move forward to go around the counter and exit this room into the hallway (main room). The doorway in Images 3 and 4 leads to the hallway seen in Images 1 and 6. I will turn 180 degrees and move forward into the hallway.", "action": "turn right 180° | move forward 150cm", "format_ok": true, "api_success": true, "api_retries": 0, "error": "", "filter_tier": "ok", "filter_info": {"match_rate": 0.5, "avg_similarity": 0.6912442396313364, "common_types": ["move"], "pred_types": ["move", "turn_right"], "gt_types": ["turn_left", "move"], "similarities": {"move": 0.6912442396313364}}}
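If my reading of the format is right, the per-timestep records can be regrouped into trajectories using the `scene_id`, `trajectory_id`, and `timestep` fields visible in the record above. The following is only a sketch of that regrouping: `group_by_trajectory` is a hypothetical helper name, and the three sample lines are made-up stand-ins that carry only the key fields, not real dataset rows.

```python
import json
from collections import defaultdict


def group_by_trajectory(lines):
    """Group per-timestep JSONL records by (scene_id, trajectory_id).

    Field names are taken from the record quoted in this issue; whether
    every record in the dataset carries them is an assumption.
    """
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        rec = json.loads(line)
        groups[(rec["scene_id"], rec["trajectory_id"])].append(rec)
    # Order each trajectory's records by time step.
    for recs in groups.values():
        recs.sort(key=lambda r: r["timestep"])
    return dict(groups)


# Made-up sample lines (only the key fields), deliberately out of order.
sample_lines = [
    json.dumps({"scene_id": "VzqfbhrpDEA", "trajectory_id": 7, "timestep": 2}),
    json.dumps({"scene_id": "VzqfbhrpDEA", "trajectory_id": 7, "timestep": 1}),
    json.dumps({"scene_id": "VzqfbhrpDEA", "trajectory_id": 7, "timestep": 3}),
]

groups = group_by_trajectory(sample_lines)
for key, recs in groups.items():
    print(key, [r["timestep"] for r in recs])
```

With the real file, the same function could be fed an open file handle in place of `sample_lines`.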

This record lists 6 images. The panoramas that these 6 images come from are shown below:

Image

The rooms shown in these pictures do not seem to be connected, and they do not match the scenes along the R2R trajectory. Could you explain what rules were used to select these images, and how they are used during training?
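The mismatch can also be checked without looking at the pictures. The sketch below assumes that the part of each skybox filename before `_skybox` is the panorama (viewpoint) ID, which is how the filenames above appear to be structured. `pano_id` is a hypothetical helper, and the IDs and paths are copied from the two annotations quoted in this issue.

```python
import re
from pathlib import PureWindowsPath

# Assumed filename layout: <pano_id>_skybox<face>_sami.jpg
SKYBOX_RE = re.compile(r"^(?P<pano>[0-9a-f]+)_skybox(?P<face>\d+)_sami\.jpg$")


def pano_id(image_path):
    """Return the panorama ID encoded in a skybox image path, or None."""
    name = PureWindowsPath(image_path).name
    match = SKYBOX_RE.match(name)
    return match.group("pano") if match else None


# Viewpoint IDs from the R2R "path" field for path_id 7.
r2r_path = [
    "c7096f5da9c94c709477ff0d88635e14",
    "f74898f691214cd784c6af1ce5b53f6a",
    "0f5d0c2dea9b4929a9c3c37ef159b360",
    "4faab9d5f15a40fa8620efc0ad9e9db2",
    "b867ba11a0e84506aaeb257e1ef85388",
    "8b7469d12514415cada466823f2a7063",
    "9177fe5f923242f0b1ba16c7a058e5ac",
]

# Image paths from the "images" field of the jsonl record.
image_dir = r"F:\MP3D\v1\scans\VzqfbhrpDEA\VzqfbhrpDEA\matterport_skybox_images"
image_names = [
    "ca9d8289f61244288b696bf832fa2713_skybox3_sami.jpg",
    "24360576a503468b92c6bbb1451e832f_skybox3_sami.jpg",
    "611e887f347a4bd4afbf09666ef657ac_skybox2_sami.jpg",
    "d70f2efc18f34b57a642332228e899d7_skybox2_sami.jpg",
    "0df37f2d82bd48fb963df1839e33fd32_skybox5_sami.jpg",
    "2de328d1e8954da6a4f8027a2d6092f0_skybox4_sami.jpg",
]
images = [image_dir + "\\" + name for name in image_names]

image_panos = {pano_id(p) for p in images}
overlap = image_panos & set(r2r_path)
print(f"{len(image_panos)} image panoramas, {len(overlap)} on the R2R path")
```

For this record, none of the six panorama IDs appears among the seven viewpoints on the R2R path, which matches what the pictures suggest.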
