Python-3.x support #27

Open
kmario23 opened this issue Jan 19, 2018 · 14 comments · May be fixed by #32

Comments

@kmario23

Hi all,
I'd like to know whether you have plans to port the codebase to Python 3. Since most people have switched to Python 3, it would be nice to have Python 3 support so that other projects that depend on coco-caption (e.g. ImageCaptioning.pytorch) can also be implemented in Python 3.

Thanks!

@xiadingZ

I have ported it to a Python 3 version, but the METEOR metric doesn't work. You can have a look: coco-caption

@salaniz

salaniz commented Feb 6, 2018

I have implemented Python 3 support for the evaluation metrics.
Have a look at my comment here: ruotianluo/ImageCaptioning.pytorch#36 (comment)

I am using my version of the eval tools together with the pycocotools from here: https://github.com/cocodataset/cocoapi
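
For reference, a minimal sketch of how the eval tools are typically driven together with pycocotools (the file paths are placeholders; adjust them to your own annotation and result files):

    from pycocotools.coco import COCO
    from pycocoevalcap.eval import COCOEvalCap

    # ground-truth captions and generated results (placeholder paths)
    coco = COCO('annotations/captions_val2014.json')
    coco_res = coco.loadRes('results/captions_results.json')

    coco_eval = COCOEvalCap(coco, coco_res)
    coco_eval.params['image_id'] = coco_res.getImgIds()  # score only images that have results
    coco_eval.evaluate()

    for metric, score in coco_eval.eval.items():
        print('{}: {:.3f}'.format(metric, score))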

@mtanti

mtanti commented May 27, 2018

I have created a fork that is both Python 3 compatible and uses the new Word Mover's Distance metric. It would be nice to merge it into this repository.

https://github.com/mtanti/coco-caption

@flauted linked a pull request Jul 2, 2018 that will close this issue
@entalent

entalent commented Aug 29, 2018

I just modified the code to support Python 3, with support for Chinese.
https://github.com/entalent/coco-caption-py3/blob/master/README.md
It was created in a hurry...so there might be bugs.

@rubencart

What's the status on this? :)

@mtanti

mtanti commented Feb 19, 2019

@rubencart They said "We are currently focusing on more of the object detection / segmentation challenges, and have decided to leave the captioning leaderboard open but not make additional updates to it."

@ozancaglayan

Another pure Python 3.x fork (no Python 2 support), with some tiny bugs fixed as well: https://github.com/ozancaglayan/coco-caption

@HYPJUDY

HYPJUDY commented Oct 8, 2019

Thanks for your contribution.
Based on @mtanti's implementation, I modified two places to support METEOR evaluation for both py2 and py3.

  1. It seems that the code of
        score_line = ' ||| '.join(('SCORE', ' ||| '.join(reference_list), hypothesis_str))
        self.meteor_p.stdin.write(score_line+'\n')

cannot support py2, so I changed it to

        if sys.version_info[0] == 2:  # python2
            score_line = ' ||| '.join(('SCORE', ' ||| '.join(reference_list), hypothesis_str)).encode('utf-8').strip()
            self.meteor_p.stdin.write(str(score_line+b'\n'))
        else:  # assume python3+
            score_line = ' ||| '.join(('SCORE', ' ||| '.join(reference_list), hypothesis_str)).strip()
            self.meteor_p.stdin.write(score_line+'\n')
  2. Add a check in compute_score
            # There is a case where the prediction is all punctuation
            # (see the definition of PUNCTUATIONS in pycocoevalcap/tokenizer/ptbtokenizer.py);
            # the prediction then becomes [''] after tokenization,
            # which means res[i][0] == '' and self._stat will fail on this input
            if len(res[i][0]) == 0:
                res[i][0] = 'a'

The complete code of meteor.py is as follows:

#!/usr/bin/env python

# Python wrapper for METEOR implementation, by Xinlei Chen
# Acknowledge Michael Denkowski for the generous discussion and help 
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import sys
import subprocess
import threading

# Assumes meteor-1.5.jar is in the same directory as meteor.py.  Change as needed.
METEOR_JAR = 'meteor-1.5.jar'
# print METEOR_JAR

class Meteor:

    def __init__(self):
        self.env = os.environ
        self.env['LC_ALL'] = 'en_US.UTF_8'
        self.meteor_cmd = ['java', '-jar', '-Xmx2G', METEOR_JAR,
                '-', '-', '-stdio', '-l', 'en', '-norm']
        self.meteor_p = subprocess.Popen(self.meteor_cmd,
                cwd=os.path.dirname(os.path.abspath(__file__)),
                stdin=subprocess.PIPE,
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
                env=self.env, universal_newlines=True, bufsize=1)
        # Used to guarantee thread safety
        self.lock = threading.Lock()

    def compute_score(self, gts, res):
        assert(gts.keys() == res.keys())
        imgIds = sorted(list(gts.keys()))
        scores = []

        eval_line = 'EVAL'
        self.lock.acquire()
        for i in imgIds:
            assert(len(res[i]) == 1)
            # There is a case where the prediction is all punctuation
            # (see the definition of PUNCTUATIONS in pycocoevalcap/tokenizer/ptbtokenizer.py);
            # the prediction then becomes [''] after tokenization,
            # which means res[i][0] == '' and self._stat will fail on this input
            if len(res[i][0]) == 0:
                res[i][0] = 'a'
            stat = self._stat(res[i][0], gts[i])
            eval_line += ' ||| {}'.format(stat)

        # Send to METEOR
        self.meteor_p.stdin.write(eval_line + '\n')
        
        # Collect segment scores
        for i in range(len(imgIds)):
            score = float(self.meteor_p.stdout.readline().strip())
            scores.append(score)

        # Final score
        final_score = float(self.meteor_p.stdout.readline().strip())
        self.lock.release()

        return final_score, scores

    def method(self):
        return "METEOR"

    def _stat(self, hypothesis_str, reference_list):
        # SCORE ||| reference 1 words ||| reference n words ||| hypothesis words
        hypothesis_str = hypothesis_str.replace('|||', '').replace('  ', ' ')
        if sys.version_info[0] == 2:  # python2
            score_line = ' ||| '.join(('SCORE', ' ||| '.join(reference_list), hypothesis_str)).encode('utf-8').strip()
            self.meteor_p.stdin.write(str(score_line+b'\n'))
        else:  # assume python3+
            score_line = ' ||| '.join(('SCORE', ' ||| '.join(reference_list), hypothesis_str)).strip()
            self.meteor_p.stdin.write(score_line+'\n')
        return self.meteor_p.stdout.readline().strip()
 
    def __del__(self):
        self.lock.acquire()
        self.meteor_p.stdin.close()
        self.meteor_p.kill()
        self.meteor_p.wait()
        self.lock.release()
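
For reference, a minimal, hypothetical usage of the wrapper above (it assumes meteor-1.5.jar sits next to meteor.py and Java is on the PATH; the captions are made up):

    # gts and res map image ids to lists of captions; res must hold exactly one
    # hypothesis per id, while gts may hold several references per id.
    gts = {0: ['a dog runs across the grass', 'a brown dog running outside'],
           1: ['a man rides a bicycle down the street']}
    res = {0: ['a dog is running on grass'],
           1: ['a man on a bike']}

    scorer = Meteor()
    final_score, per_image_scores = scorer.compute_score(gts, res)
    print('METEOR:', final_score)
    print('per-image scores:', per_image_scores)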

@mtanti

mtanti commented Oct 8, 2019 via email

@ozancaglayan

Python 2 will be end-of-life next year. Why do you bother supporting it still?

@HYPJUDY

HYPJUDY commented Oct 8, 2019

Thanks @mtanti for pointing it out! I've modified the code.
@ozancaglayan Since I use code from some repositories that originally only supported Python 2, I am migrating to Python 3 and switching between the two to compare performance.

@MarcusNerva

(quoting @HYPJUDY's METEOR fix above)

Thanks, your solution helped me solve the problem where proc.stdout.readline() hangs!

@kracwarlock

I just stumbled across this; our https://github.com/Maluuba/nlg-eval supports Python 3.
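
For anyone comparing options, a rough sketch of scoring a single hypothesis with nlg-eval, based on the usage shown in its README (check the repository for the exact, current API):

    from nlgeval import NLGEval

    nlgeval = NLGEval()  # loads the metric models once

    references = ['a dog runs across the grass',
                  'a brown dog running outside']
    hypothesis = 'a dog is running on grass'

    # one hypothesis string scored against a list of reference strings
    metrics = nlgeval.compute_individual_metrics(references, hypothesis)
    print(metrics)  # Bleu_1..Bleu_4, METEOR, ROUGE_L, CIDEr, and embedding-based metrics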

@kmario23
Author

(quoting the original post above)

It has been 3 years since I first commented, and a lot has changed in the meantime. I'm now working with a much more elegant toolkit, facebookresearch/vizseq, which supports visualization, extends to multiple modalities (video, audio), and includes more recent embedding-based metrics.
