Detection and classification #2

Closed
emrekeles-arch opened this issue Jun 9, 2024 · 31 comments

Comments

@emrekeles-arch
Copy link

Hello, there are three questions I want to ask you.

  1. What is the purpose of the Breastclip folder? Is it part of the project? If so, at what stage is it used?

  2. I saw a .py file in Breastclip that generates reports from the findings. Are these reports obtained from the VinDr dataset? Are the generated reports used in the detection and classification tasks, and if so, through which files?

  3. Which method should I follow if I want to perform detection and BI-RADS classification with the image-text data structure?

Thank you for your work

@shantanu-ai
Copy link
Member

shantanu-ai commented Jun 9, 2024

Hi,
Thanks for taking an interest in our repo.

  1. The breastclip folder is the directory with all the files necessary to train Mammo-CLIP: the CLIP model, the contrastive loss, the data loader, etc. When we created the project we initially named it breastclip, and the name has stayed as it is. It is a core part of the project and is used from the beginning. If you train Mammo-CLIP from scratch, the train.py file internally calls the breastclip folder for training. If you use our checkpoints for any downstream task, both the classification and detection files call breastclip to set up the model.
  2. The name of the file is ./src/codebase/augment_text.py. No, the reports are not obtained from the VinDr dataset; we obtain them from our in-house UPMC dataset. From VinDr, we use the finding labels to generate templated texts dynamically in the dataloader during pre-training of Mammo-CLIP. They play no role in classification, detection, or zero-shot evaluation; they are only used for pre-training. For classification and detection with our checkpoints, just follow the steps for classification and detection.
  3. For detection, go to this place. Then follow either "Linear probe vision encoder Mammo-CLIP on target detection task" for linear probing or "Finetune vision encoder Mammo-CLIP on target detection task" for finetuning. Linear probing means the vision encoder of Mammo-CLIP is frozen; for finetuning, we finetune the vision encoder as well. Please place the checkpoints in the proper folder. We also include all the scripts and their utilities here. Follow the detector files.

We did not perform BI-RADS classification; we classify density, mass, and calcification for VinDr and cancer for RSNA. However, you can do BI-RADS very easily: follow the classification steps here. If you use the VinDr dataset, make sure your csv file contains the BiRADS column. Then pass BiRADS to --label and run ./src/codebase/train_classifier.py in either linear-probe or finetune mode. If you need to change the dataset class, we use the MammoDataset class in this file.
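
If it helps, here is a quick sanity check (an illustrative sketch only, not part of the repo; the csv path is a placeholder) that your csv actually carries the BiRADS column before running train_classifier.py with --label BiRADS:

    import pandas as pd

    # placeholder path; point this at the csv you pass to train_classifier.py
    df = pd.read_csv("your_vindr.csv")

    # train_classifier.py is run with --label BiRADS, so the column must exist
    assert "BiRADS" in df.columns, "add a BiRADS column to the csv first"
    print(df["BiRADS"].value_counts())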

@emrekeles-arch
Copy link
Author


Thank you so much :)

@shantanu-ai
Copy link
Member

If you have any questions, do let me know. Happy coding.

@emrekeles-arch
Copy link
Author

What is the main dataset you use to train Breastclip?

@shantanu-ai
Copy link
Member

As mentioned in the paper, we use two configurations:

  1. We use the in-house UPMC dataset, which contains images + reports. We preprocess it to create the csv, and then run ./src/codebase/augment_text.py to create the final csv to train Mammo-CLIP. The instructions are mentioned here.

  2. We also pretrain using UPMC (image + text data) together with VinDr (image + label). The labels are converted to templated texts as mentioned in the preprocessing steps.

By the way, for classification and detection you don't need to pretrain Mammo-CLIP. Just use our checkpoints to train the classifier or detector in linear-probe or finetuning mode.

@emrekeles-arch
Copy link
Author

src/codebase/breastclip/prompts/prompts.py

What is the function of the Python file at the above path?

@shantanu-ai
Copy link
Member

shantanu-ai commented Jun 10, 2024

@emrekeles-arch, this file generates report text from the finding labels of the VinDr dataset. In Mammo-CLIP you can use any image-label dataset during the pre-training stage. For an image-label dataset, you first need to generate report sentences from the labels (Mass, Calcification, distortion, etc.) so that they can be integrated into the pre-training. Based on the label and laterality, you get the templated text here. prompts.py uses these templates and the labels to generate sentences for you.

@emrekeles-arch
Copy link
Author

Did you aim to create a multimodal data structure here?

@shantanu-ai
Copy link
Member

@emrekeles-arch, I create the texts from labels in src/codebase/breastclip/prompts/prompts.py. That function is called from ./src/codebase/breastclip/data/datasets/imagetext.py (around line 200):

    elif hasattr(self.df, "CC_FINDING"):
        cc, mlo = view_list
        cc_findings = ast.literal_eval(self.df[f"{cc}_FINDING"][index])
        mlo_findings = ast.literal_eval(self.df[f"{mlo}_FINDING"][index])
        text = generate_report_from_labels(cc_findings, self.prompt_json, deterministic=(self.split != "train"))
        text2 = generate_report_from_labels(mlo_findings, self.prompt_json, deterministic=(self.split != "train"))

If you want to pretrain Mammo-CLIP with image-label data, this function will be invoked while creating the dataset class.
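
For intuition, here is a deliberately simplified, hypothetical version of such a label-to-sentence templating step (the real prompts.py reads templates from a prompt JSON and handles many more cases; treat this only as an illustration):

    import random

    # toy templates keyed by finding label; the real prompt_json is much richer
    TOY_TEMPLATES = {
        "Mass": ["A mass is seen in the {side} breast.", "The {side} breast shows a mass."],
        "Suspicious Calcification": ["Suspicious calcifications are noted in the {side} breast."],
        "No Finding": ["No significant abnormality in the {side} breast."],
    }

    def toy_generate_report_from_labels(labels, side="left", deterministic=False):
        """Turn a list of finding labels into templated report sentences."""
        sentences = []
        for label in labels:
            options = TOY_TEMPLATES.get(label, ["The {side} breast shows " + label.lower() + "."])
            template = options[0] if deterministic else random.choice(options)
            sentences.append(template.format(side=side))
        return " ".join(sentences)

    print(toy_generate_report_from_labels(["Mass"], side="right"))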

@shantanu-ai
Copy link
Member

Hi @emrekeles-arch, if your doubts have been clarified, please close this issue. If you have more queries, feel free to ask me.

@emrekeles-arch
Copy link
Author

Hello again, I want to do a new training using your model's checkpoints. The dataset I have consists only of images, but I want to do a multimodal training. Would it be useful or unnecessary if I created sentences from the information in the csv file using your prompts.py file, matched each sentence with an image and sent it to the training? Since you are an expert, I wanted to get your opinion.

Thank you for your patience and nice replies.

@shantanu-ai
Copy link
Member

shantanu-ai commented Jun 14, 2024

Hi @emrekeles-arch, if you want to further pretrain with your dataset (after initializing Mammo-CLIP with our checkpoints), you can create sentences and add them during the pretraining. However, I think with this training Mammo-CLIP may forget the knowledge it gained from pre-training with image + text data (which we have from UPMC), because you only have image data and your texts are templated, not real reports. We have not done such an experiment where we train with image + text and then further pretrain with image + templated text data.

Just to note how we pretrain using UPMC (image + text) and VinDr (image only), here are the steps for every epoch:

  1. We first mix both datasets. Suppose UPMC has 100 samples and VinDr has 50 samples; now we have 150 samples in total.
  2. For each minibatch, we randomly sample images. If an image comes from UPMC (image + text), we use its text as is. If it comes from VinDr (image only), we generate text with prompts.py based on the VinDr finding labels. Suppose in a minibatch of size 16, 10 samples come from UPMC and 6 come from VinDr; we generate texts for the 6 VinDr samples. Once texts are ready for all 16, we compute the embeddings and do the contrastive training as described in the paper (see the sketch below).

Hope it clarifies your question.
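
A rough sketch of that per-sample text selection (hypothetical names, not the actual Mammo-CLIP code):

    from typing import List, Optional

    def text_for_sample(report: Optional[str], finding_labels: List[str]) -> str:
        """Real report if the sample has one (UPMC-style); otherwise a templated sentence (VinDr-style)."""
        if report is not None:
            return report
        # extremely simplified templating; the repo builds this via prompts.py
        return " ".join(f"The breast shows {label.lower()}." for label in finding_labels) or "No finding."

    # e.g. for a mixed minibatch of (report, finding_labels) pairs:
    batch = [("Benign-appearing mass in the left breast.", []), (None, ["Mass", "Suspicious Calcification"])]
    texts = [text_for_sample(r, labels) for r, labels in batch]
    print(texts)  # every image now has a text, so the contrastive loss covers the whole minibatch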

@emrekeles-arch
Copy link
Author

Do you have the opportunity to share the entire in-house dataset you have?

@kayhan-batmanghelich
Copy link
Contributor

kayhan-batmanghelich commented Jun 15, 2024 via email

The in-house dataset cannot be shared for legal reasons.

@emrekeles-arch
Copy link
Author

I had guessed, but I still wanted to take my chances. Thank you for your reply.

@shantanu-ai
Copy link
Member

shantanu-ai commented Jun 15, 2024

@emrekeles-arch, I have attached a dummy version of the in-house dataset here. This is the image + text dataset used in pretraining.

The texts are dummy or templated texts and the patient IDs are random numbers, but the csv has the same structure as the one you need (and that we use) to pre-train Mammo-CLIP.

@emrekeles-arch
Copy link
Author

@shantanu-ai, How did you get the resized coordinates?

@shantanu-ai
Copy link
Member

shantanu-ai commented Jun 23, 2024

@emrekeles-arch, I preprocessed the images and resized the coordinates as part of that step. The resized coordinates can be found here.

@emrekeles-arch
Copy link
Author

@shantanu-ai, I know I have asked a lot of you and you have helped me a lot; I am grateful for all of it. I would like to ask one more thing: I have an external dataset and I want to resize its coordinates. Can you share which technique you used to resize your own dataset?

@shantanu-ai
Copy link
Member

@emrekeles-arch, please use this file.

@emrekeles-arch
Copy link
Author

emrekeles-arch commented Jun 26, 2024

@shantanu-ai, while training my dataset, the mAP values are constantly 0.000 in every epoch. The coordinates in the CSV file look like this: "668.3909,114.3665;668.3909,274.6058;884.4713,274.6058;884.4713,114.3665"; these represent the coordinates of 4 different points. From these I extracted the xmin, ymin, xmax, and ymax values and resized them according to the cropping ratio of the image. During labeling, the origin (0,0) was defined as the center of the image. Can you give me an idea as to why the mAP values are always 0?

Additionally, when I run the preprocessing code you provided on my own csv file, values are assigned to the resized_xmin, resized_ymin, ... columns even for rows with the 'No finding' class. Normally these values should be 0, 0, 0, 0; I couldn't understand why.

[Screenshot attached]

@shantanu-ai
Copy link
Member

shantanu-ai commented Jun 26, 2024

@emrekeles-arch, a couple of points:

  1. Did you use the last file that I uploaded to adjust the bbox coordinates? During preprocessing, I extracted the breast region from the mammograms, so the bounding boxes have to be adjusted accordingly. If you just rescale and readjust, that won't work here (see the sketch after this list).

  2. I did not include the no_findings samples of VinDr; I only use the samples with mass and calcification in the paper. So the code here only uses the samples without the no-finding label. You can see that I was selecting the first 2000+ rows, which have at least one finding. However, if you want to include them, just set the coordinates to 0; that's why they are empty for no_findings in this file. I also trained with them; the results won't differ much.
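
To make the adjustment concrete, here is an illustrative sketch (not the repo's preprocessing file) that parses such a 4-point coordinate string and maps the box into the frame of a breast-region crop that was subsequently resized. The crop origin and scale factors are placeholders you would take from your own preprocessing, and the sketch assumes a top-left image origin, so boxes labeled around an image-center origin must be shifted to top-left coordinates first:

    def polygon_to_bbox(coord_str: str):
        """'x1,y1;x2,y2;...' -> (xmin, ymin, xmax, ymax) in original image coordinates."""
        points = [tuple(map(float, p.split(","))) for p in coord_str.split(";")]
        xs, ys = zip(*points)
        return min(xs), min(ys), max(xs), max(ys)

    def adjust_bbox_to_crop(bbox, crop_x0, crop_y0, scale_x, scale_y):
        """Shift the box by the crop origin, then scale it to the resized crop."""
        xmin, ymin, xmax, ymax = bbox
        return ((xmin - crop_x0) * scale_x, (ymin - crop_y0) * scale_y,
                (xmax - crop_x0) * scale_x, (ymax - crop_y0) * scale_y)

    bbox = polygon_to_bbox("668.3909,114.3665;668.3909,274.6058;884.4713,274.6058;884.4713,114.3665")
    # placeholder crop origin and resize scales; take these from your breast-region cropping step
    print(adjust_bbox_to_crop(bbox, crop_x0=100.0, crop_y0=50.0, scale_x=0.5, scale_y=0.5))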

@emrekeles-arch
Copy link
Author

@shantanu-ai, how do I use a ResNet backbone with RetinaNet in the object detection task?

@shantanu-ai
Copy link
Member

You will find a plethora of examples; here is one that we used in many of our papers: https://github.com/yhenon/pytorch-retinanet

@emrekeles-arch
Copy link
Author

@shantanu-ai, when I try to train with FasterRCNN, I get the following error, which I couldn't figure out. Do you have any idea how I can solve it?

targets should not be none when in training mode

@shantanu-ai
Copy link
Member

You probably have the targets from the csv set to none for the No-finding samples; you need to set them to 0, 0, 0, 0. Also, did you try the EfficientNet and follow the instructions? It is advisable to use our model.
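
A minimal pandas sketch of that fix, assuming resized_* column names like those in your csv (adjust the names and path to match your file):

    import pandas as pd

    df = pd.read_csv("your_detection.csv")  # placeholder path
    box_cols = ["resized_xmin", "resized_ymin", "resized_xmax", "resized_ymax"]
    # 'No finding' rows have no box; give them 0, 0, 0, 0 instead of NaN/None
    df[box_cols] = df[box_cols].fillna(0.0)
    df.to_csv("your_detection_fixed.csv", index=False)

Note also that torchvision's FasterRCNN raises this error when the targets argument (a list of dicts with "boxes" and "labels" tensors) is not passed while the model is in training mode, so make sure your training loop always supplies targets.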

@emrekeles-arch
Copy link
Author

@shantanu-ai, yes, I have tried EfficientNet, but I also want to train and compare with different models, like a Swin Transformer or a Vision Transformer.

I don't get a target nan warning when training with Retinanet, but it becomes a problem when I switch to FasterRCNN.

I will try and check again on your suggestion, thank you very much.

@shantanu-ai
Copy link
Member

You don't get the NaN for RetinaNet because I handled it in the code. I did not try FasterRCNN, though, because RetinaNet is a go-to detector for medical images.

@emrekeles-arch
Copy link
Author

Are there any other models that I can use as the backbone of RetinaNet besides ResNet and EfficientNet? Can you make a suggestion?

@shantanu-ai
Copy link
Member

I think any CNN model can be used, for example DenseNet121. For ViTs, you need to search.
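
For example, with recent torchvision (a different code path from the yhenon repo linked above, shown only as a rough sketch), a DenseNet121 backbone could be wrapped with an FPN and plugged into RetinaNet roughly like this:

    import torch
    import torchvision
    from torchvision.models.detection import RetinaNet
    from torchvision.models.detection.anchor_utils import AnchorGenerator
    from torchvision.models.detection.backbone_utils import BackboneWithFPN

    densenet = torchvision.models.densenet121(weights=None)
    # take feature maps after denseblock2/3/4 (strides 8/16/32, channels 512/1024/1024)
    backbone = BackboneWithFPN(
        densenet.features,
        return_layers={"denseblock2": "0", "denseblock3": "1", "denseblock4": "2"},
        in_channels_list=[512, 1024, 1024],
        out_channels=256,
    )

    # one anchor config per feature level (3 FPN levels + the default extra max-pool level = 4)
    anchor_generator = AnchorGenerator(
        sizes=((32,), (64,), (128,), (256,)),
        aspect_ratios=((0.5, 1.0, 2.0),) * 4,
    )
    model = RetinaNet(backbone, num_classes=2, anchor_generator=anchor_generator)

    model.eval()
    with torch.no_grad():
        out = model([torch.rand(3, 512, 512)])
    print(out[0]["boxes"].shape)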

@shantanu-ai
Copy link
Member

Closing this issue. If you have any queries feel free to open one.
