Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Repeated Outputs Issue with Shot2Story Dataset Stage3 Fine-tuning #43

Open
katie312 opened this issue Nov 14, 2024 · 0 comments
Open

Repeated Outputs Issue with Shot2Story Dataset Stage3 Fine-tuning #43

katie312 opened this issue Nov 14, 2024 · 0 comments

Comments

@katie312
Copy link

Hi,

I'm experiencing repeated outputs after fine-tuning with the Shot2Story dataset. I've kept the fine-tuning code and parameters at their defaults and only modified the JSON format of the dataset. Feature extraction is done using CLIP.

json format:

{
        "id": "--_JTICeYtk",
        "conversations": [
            {
                "from": "human",
                "value": "Can you provide a summary of the events shown in the video?"
            },
            {
                "from": "gpt",
                "value": "In the video, a man dressed in black stands against a red brick wall while an athlete in a white jersey runs by him. The scene then transitions to a man throwing a ball at three people on a pitch, and the man in white hits it. The camera follows the ball as it is quickly passed from one player to another, with a line mentioned being driven over the shortstop. A player, dressed in white, runs on the field, high-fiving teammates, as the background voice announces that Daniel Robert gives a two to nothing lead with an RBI single to left by a freshman designated hitter. The scene alternates back to the man throwing the ball to three people, and the man in white hitting it multiple times. Players sprint across the field, as the text 'AHD 2ND INNING' appears on a blue background. The scene continues to show the man throwing the ball and the man in white hitting it repeatedly. The action escalates as a white ball is missed by an athlete but quickly retrieved by another player from the right side of the screen. The video concludes with two players in white jerseys standing on the field amidst others, while the commentator mentions a bad hop to the shortstop. "
            },
            {
                "from": "human",
                "value": "What details can be observed about the player in the scene where Daniel Robert gives the lead with an RBI single?"
            },
            {
                "from": "gpt",
                "value": "In that scene, Daniel Robert is noted for giving his team a two to nothing lead with an RBI single to left field, as highlighted by the background voice in the video's captions. This play is noted by the commentary as a meaningful contribution from Daniel Robert, highlighting the player's impact on the game's score at this point. "
            },
            {
                "from": "human",
                "value": "During the sequence with the man in black, what can be noted about the actions of the athlete in white?"
            },
            {
                "from": "gpt",
                "value": "In the sequence with the man standing against the red brick wall, the athlete in white is seen running past him. The scene transitions from the man in black to the athlete in white moving dynamically as he runs past, suggesting a shift in focus toward the athletic activity. The athlete's dynamic movement indicates a transition from a passive scenario, where the focus was on the man in black, to an active scene centered around the athlete's involvement in the action. "
            }
        ],
        "meta": {
            "duration": null
        }
    }


I've checked the dataset integrity and input consistency but can't identify the cause of the duplicates. Any guidance on this issue would be appreciated.

Thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant