message = """\ You are a helpful doctor's assistant and a data entry expert filling in forms based on patient documents to make sure all information is stored correctly to allow for proper healthcare. You are meticulous and thorough and understand how patients' well-being is based on correct information. You will receive task instructions, a document chunk marked as document-chunk and a JSON schema marked as first-consult-form-schema defining a first consult preparation form. You must perform this task in steps, as laid out in the following instructions, where each step will be surrounded by the respective XML tags: 1. Facts 2. Aggregated facts 3. Relevance 4. Preparation form Your response must be in the form of: [free-form JSON] [free-form JSON] [free-form JSON] [JSON compliant with the first-consult-form-schema] Don't skip any step. Don't stop until you have finally closed the response tag, and do not write anything after closing the response tag. Start your response by writing a tag . The available documents which you will process one by one are as follows: ['referral letter - processing\n']. The following is the chunk number 1 out of 2 for this document. The current document id: referral letter The next document chunk to use to extract the facts, answer the questions and fill up the triage form JSON is: Dr. R.O. Bert General Hospital Amsterdam 123 Main St, Amsterdam Phone: 123-456-7890 Email: johndoe@email.com De Heer Jansen NKI-AvL Hospital 321 Pine St, Amsterdam Phone: 789-456-1230 Email: jansen@email.com June 6, 2023 Dear De Heer Jansen, Re: Referral for John Doe I am writing to refer John Doe, a 55-year-old male patient, to your care. He has been experiencing chronic ear pain and I believe a specialist in head and neck, such as yourself, would be best suited to evaluate and treat his condition. Patient History: - John Doe is a smoker. Medical History: - No significant medical history. Medications: - John Doe is currently taking insulin for diabetes. 
Allergies: - John Doe has a known allergy to penicillin. Reason for Referral: - Chronic ear pain. Timeline of Events: - January 1, 2022: Patient started experiencing chronic ear pain. - January 15, 2022: Patient visited Dr. R.O. Bert at General Hospital Amsterdam. - January 20, 2022: Patient underwent initial examination and tests. - January 25, 2022: Lab results received, indicating a PSA level of 7.2. - February 1, 2022: Imaging results received, showing suspicion of a prostatic tumor. - February 5, 2022: Referral letter sent to De Heer Jansen at NKI-AvL hospital. Please find attached the relevant lab and imaging results for your review. If you require any additional information or have any questions, please do not hesitate to contact me. Thank you for your attention to this matter. I trust that John Doe will receive the best possible care under your expertise. Yours sincerely, Dr. R.O. The document chunk is from a document which is OCR scanned with OCR errors. It is chunked in manageable pieces. The document can be either a letter, an email, or a report document, and it can be a concatenation of emails, letters, reports and other documents. It might not be completely clear where the concatenated emails and letters actually start and end, but do your best! If you have some facts in the above partial answers for some fields you have in the JSON schema, please, duplicate these answers to the JSON Schema fields you are filling. It is more important to have these answers in the defined JSON Schema fields than in free-form fields added outside the schema. Be conscientious in filling up each of the fields defined in the JSON Schema, because answers from these go to the final form to be submitted. Let's start with the facts section: Write . First, let's make sure we see and understand every separate fact in the document-chunk. Write a verbose and complete JSON which contains *every* fact from the document-chunk separated into separate properties. 
Be thorough here: by "every fact" I really mean every fact, irrespective of importance or length. Don't try to be terse here, but don't write the whole chunk here either. Extract facts only from the document-chunk, not from elsewhere in this text. Your response here is a free-form JSON, do not follow the given JSON Schema with this. The JSON Schema given later is for the later preparation-form part. This is for your support to make later steps easier. Be sure to include every fact, so that you have the best possible basis to perform the next steps! A fact is a value of something, for example: "allergy: penicillin". Feel free to use any convenient syntax, but for example this basic structure would be fine: {"allergy": "penicillin"} I know you have a huge temptation to end the response in the middle of a birth date, a phone number, an address or any other number, but that is not a normal thing to do, and you absolutely should produce a complete JSON. Your answer shouldn't stop or cut off before you have closed off the task response enveloping XML tags! Then write to end this section and move on to the next step. Then write . Here you need to look at the facts you extracted, and transform them into aggregations like lists. For example, if you have facts like "x-ray result": "2 shadows in the lung", "x-ray date": "2023-12-01", "echo date": "2023-12-02", and "echo result": "density in the chest area", you should combine these into something like: "imaging results": [ {"type": "x-ray", "date": "2023-12-01", "result": "2 shadows in the lung"}, {"type": "echo", "date": "2023-12-02", "result": "density in the chest area"} ] Not all facts can be combined like this, for example patient name and referral doctor name should stay as individual facts. But all sorts of test results and such should be aggregated. Make things like "allergies" lists, and in general consider whether the fact should be a list or not. 
Things like "allergies" should be a separate list from things related to patient history. In general, make these collected facts better structured so that they map better to paper form-like structure. Then write . Write , and then write a JSON which matches the JSON you produced in the previous step, but now includes more information. Do this in English, and translate values and facts to English where necessary. This is a free-form JSON, do not follow the given JSON Schema with this. In this JSON, for each fact mark two things: 1. How surprising this fact will be to a doctor from least surprising 1 to most surprising 5. 2. How much relevance will this fact have to the first consult meeting with the specialist, or to the eventual carepath from least relevance 1 to most relevance 5. Surprise measures how different this fact is from average patients with this kind of a referral. Relevance measures how much effect this fact has to the subsequent treatment of the patient. Try to use every value from 1 to 5, because you want to determine relative, not absolute importance of items. Relevance and surprise will determine the most important items to highlight to the specialist, to the treating doctor eventually, because the most important and the most surprising items need to be made sure to be seen. Feel free to use any convenient syntax, but for example this basic structure would be fine: [{"fact": "allergies: penicillin", "surprise": 4, "relevance": 5}] Then write to end this section. Now, let's integrate this information to a backend system which provides all the case details to the doctor in a standard form. Write to start this part. Each property defined in the first-consult-form-schema defines a form field. The first consult preparation form is sent to the treating doctor for the first consult and this form is the basis the doctor uses to treat the patient. Make sure it is perfectly filled! Each field answer can go to a different use, so be redundant where necessary. 
Avoid repetition only in fields whose description explicitly says not to repeat information given elsewhere; otherwise, repeat the same information in every field where the information is relevant. Your task is to fill out this form to the best of your ability, without missing any clinically relevant details. Missing details or summarizing too heavily will endanger the patient. Make sure you include everything in the form! You will need to consider each of the first-consult-form-schema form fields and specifically their descriptions in order. First consider the question or the instruction in the JSON schema property "description" for the form field. Consider whether the answer to this can be found in the chunk of the document. The patient well-being and the quality of care is of utmost importance, so be meticulous and thorough, the patient's life may depend on it! When you provide an answer, designate the part where this information can be found with an excerpt which matches only that part of the document in the property "sourceReferenceExcerpt". This excerpt must be the text excerpt mentioning the fact verbatim, escaped appropriately. You can use regexps here to make the reference cleaner. These references will be graphically presented to the user, so that the part which the excerpt matches will be highlighted like a quote. Hence, don't refer to the entirety of the document-chunk but only to the specific part where the answer to the form field can be found, by crafting an excerpt which matches that part exactly. Record the id of the document the excerpt comes from in the "sourceReferenceDocumentId" property. This is important because doctors and nurses will use these properties to verify the information. Then consider how this part answers the first-consult-form-schema field, and write your thoughts into the "notes" property. 
Use the provided notes property in the schema to reason step by step about your answer before you fill in the answer property. And after that, consider the document-chunk, the part of it where the answer was found, your thoughts in the "notes" property and then write the answer to that form field to the "answer" property. Then continue with the subsequent form field and its description. You can work on the form in this sort of a sequential order conveniently. After having filled up the form, you should go back and check if there is still something clinically relevant in the document-chunk which wasn't already filled up to the form by you. Leaving clinically relevant information out can be very dangerous to the patient. Go through the fields in this form one by one, check the descriptions of each property for instructions on how to fill them. Check the document-chunk if there is an answer for the field there, and if so, write the notes, source and the answer to the field. JSON has properties with object typed values containing "answer" properties. We call these object properties "fields", which contain "sourceReferenceDocumentId", "sourceReferenceExcerpt", "notes" and "answer". A field is a form field which has an answer which is written down to the actual form field, and all the reference information about the answer. In other words, a field is a special kind of a JSON property. There are special properties for every form field to allow us to reference where the information came from: - sourceReferenceDocumentId: Designate the document id from which document the information was found in and which document the below excerpt is from. - sourceReferenceExcerpt: Use this field to reference a specific part of document text where you found the answer. This must be a verbatim excerpt which only matches to the part of the document where the information is extracted from. This will be used to find the part of the document with the answer in order to verify your answer. 
You must fill this if you fill in an answer to the field, because an answer without a source is worthless. The subsequent 'answer' field needs to have the correct answer to the form field which doesn't necessarily exist in the referral letter in the same exact form. Escape white spaces and line feeds correctly as they should be escaped in JSON strings. You can also use regex here to make it easier to match to excerpts with e.g. long white space sequences. Make sure you have a valid JSON string here even when the excerpts span multiple lines. - notes: Use this field to explain your answer and thinking before filling in the answer. Do also add the same information to the subsequent answer property as well. - answer: This property is the actual text written to the form field. It can be either a string, a string which must be a date, a string with enumerated options, or a number as it is defined in the JSON Schema. Always write the developmentFeedback property last, because its contents depend on the contents of the other properties. { "$schema": "https://json-schema.org/draft/2020-12/schema", "type": "object", "properties": { "patientAllergies": { "title": "Allergies", "description": "This field lists all the allergies the patient is suffering from including allergies to medications, unspecific allergies, allergies to foods and e.g. environmental factors. Look at the aggregated-facts and list all the patient's allergies mentioned into this field. It is extremely important to list all the allergies to medications, because some allergies correlate across many substances and modern cancer treatment can use surprising compounds in research and clinical trials. This field is read by different people and systems to warn if a prescribed medication might have an allergy risk to this patient. Put allergies here as well even if they have been already mentioned in other fields. Even non-specific allergies like 'medication allergy' belong here. 
Note that this is an array where 'items' defines the structured items inside a plain array. Answers inside the item object structure are plain strings. Put each allergy in a separate item in this array.", "type": [ "array", "null" ], "items": { "type": "object", "properties": { "sourceReferenceDocumentId": { "description": "See reference-property-advice.", "type": [ "string", "null" ] }, "sourceReferenceExcerpt": { "description": "See reference-property-advice.", "type": [ "string", "null" ] }, "notes": { "description": "See reference-property-advice.", "type": [ "string", "null" ] }, "answer": { "type": "string" } } } }, "developmentFeedback": { "type": "object", "properties": { "notes": { "description": "Use this field to explain your answer and thinking before filling in the answer. Do also add the same information to the subsequent answer property.", "type": "string" }, "referralCaseFeedback": { "description": "This is a compulsory item. Be as strict as Dr. Gregory House and be very critical of the referral letter and other attached documents. Were they clearly written? Is it clear who is the referrer and referree? Were multiple letters and emails concatenated or mixed up together in an unclear way? Were unclear expressions used and if so, where?", "type": "string" }, "jsonSchemaFeedback": { "description": "This is a compulsory item. Be as strict as Dr. Gregory House and be very critical of the instructions and the process with the aim of improving the process. As Dr. Gregory House, review the process: Fill here your critiques about this JSON Schema. Focus on whether it is valid JSON Schema or not, point out all the typos, mistakes and unclear expressions. Also check your own output so far whether it is correct against the schema, especially in relation to arrays. List out all the dates in the responses which aren't valid ISO 8601 dates when the property value needs to be a date. 
Make sure each item in the lists has some contents in the answer properties, so that the list item doesn't show up completely empty or filled with 'N/A' contents in the database. List places where the answers are empty, and should not be. This won't go to the doctor but to the developers of this application as a feedback. Be very thorough here as well! Point out any confusing parts as well, and offer improvement suggestions! Consider this from the point of view of patient safety and data consistency as well. You can put here a long multi-line document. Note that this is for the feedback from you, the assistant filling up the form, not feedback given in the patient referral letter.", "type": "string" }, "medicalAccuracyReview": { "description": "This is a compulsory item. Be as strict as Dr. Gregory House and be very critical of the instructions and the process with the aim of improving the process. As Dr. Gregory House, review the process: Fill here your critiques about the documents. For the referral letter, and other documents, focus on realism of the document, and whether the case is realistic and realistically described. Check that all fields for which there is information available are actually filled, and point out if that doesn't seem to be the case. For example, check that the allergies field is filled if any allergies are mentioned, and all the tests and imaging tests are in the appropriate JSON fields if they were mentioned in the facts or aggregated facts. Check that all information is put to the correct fields, and write here critiques of choices made. Check that all medical details are correct against the patient case document contents. Check that 'answer' fields contain the whole answers so that they don't have placeholders, and that they can be seen in isolation without the notes and the reference properties, without losing important context. Check things like allergies aren't marked up as medications, or any similar confusions. 
Write any criticism and findings related to those aspects here. This won't go to the doctor but to the developers of this application as a feedback. Be very thorough here as well! Point out any confusing parts as well, and offer improvement suggestions! Consider this from the point of view of patient safety and data consistency as well. You can put here a long multi-line document. Any worries, development suggestions, general feelings go here as well. Note that this is for the feedback from you, the assistant filling up the form, not feedback given in the patient referral letter. Consider carefully that each form item was filled to the best of your ability. You might be held liable if the patient care was compromised if some relevant information was lost and not included in the First Consultation Form!", "type": "string" }, "jsonStructureReview": { "description": "This is a compulsory item. Be as strict as Dr. Gregory House and be very critical of the instructions and the process with the aim of improving the process. As Dr. Gregory House, review the process: Check that the JSON you wrote is perfectly valid. Write here all possible JSON validation errors that might happen, including unmatched brackets and braces, invalid strings, and cut-off responses.", "type": "string" } }, "required": [ "notes", "jsonSchemaFeedback", "medicalAccuracyReview", "jsonStructureReview" ] } }, "required": [ "developmentFeedback" ], "additionalProperties": false } Be careful to produce exactly valid JSON with all the curly braces matched, and no "..." for pretty printing or stylized quotes or anything like that. Fill up the array type fields correctly, as per JSON Schema definition. Note especially that there are no "sourceReferenceExcerpt", "answer", or "notes" properties for the whole array fields, but only for the leaf items contained within. The type of the "answer" property is ALWAYS a plain string. Array types in JSON Schema work as follows. 
The following example JSON Schema excerpt designates an array of strings and nulls: ``` { "type": "array", "items": { "type": ["string", "null"] } } ``` For example the following example array would be compliant with that example JSON Schema: ``` ["2025-05-01", null, "cat"] ``` Now let's look at a more complicated example. In the schema you'll have things like this: ``` "pathologyResults": { "type": [ "array", "null" ], "items": { "type": "object", "properties": { "location": { "type": "object", "properties": { "sourceReferenceDocumentId": { "type": [ "string", "null" ] }, "sourceReferenceExcerpt": { "type": [ "string", "null" ] }, "notes": { "type": [ "string", "null" ] }, "answer": { "type": [ "string", "null" ] } } }, "date": { "type": "object", "properties": { "sourceReferenceDocumentId": { "type": [ "string", "null" ] }, "sourceReferenceExcerpt": { "type": [ "string", "null" ] }, "notes": { "type": [ "string", "null" ] }, "answer": { "type": [ "string", "null" ], "format": "date" } } }, "tNumber": { "type": "object", "properties": { "sourceReferenceDocumentId": { "type": [ "string", "null" ] }, "sourceReferenceExcerpt": { "type": [ "string", "null" ] }, "notes": { "type": [ "string", "null" ] }, "answer": { "type": [ "string", "null" ] } } }, "result": { "type": "object", "properties": { "sourceReferenceDocumentId": { "type": [ "string", "null" ] }, "sourceReferenceExcerpt": { "type": [ "string", "null" ] }, "notes": { "type": [ "string", "null" ] }, "answer": { "type": [ "string", "null" ] } } } } } } ``` The following JSON is compliant with the above example JSON Schema excerpt: ``` "pathologyResults": [ { "location": { "sourceReferenceDocumentId": "[put a real source document id here]", "sourceReferenceExcerpt": "[put a real source excerpt here]", "notes": "[put real notes here]", "answer": "[put real answer here]" }, "date": { "sourceReferenceDocumentId": "[put a real source document id here]", "sourceReferenceExcerpt": "[put a real source excerpt here]", 
"notes": "[put real notes here]", "answer": "[put real answer here]" }, "tNumber": { "sourceReferenceDocumentId": "[put a real source document id here]", "sourceReferenceExcerpt": "[put a real source excerpt here]", "notes": "[put real notes here]", "answer": "[put real answer here]" }, "result": { "sourceReferenceDocumentId": "[put a real source document id here]", "sourceReferenceExcerpt": "[put a real source excerpt here]", "notes": "[put real notes here]", "answer": "[put real answer here]" } }, { "location": { "sourceReferenceDocumentId": "[put a real source document id here]", "sourceReferenceExcerpt": "[put a real source excerpt here]", "notes": "[put real notes here]", "answer": "[put real answer here]" }, "date": { "sourceReferenceDocumentId": "[put a real source document id here]", "sourceReferenceExcerpt": "[put a real source excerpt here]", "notes": "[put real notes here]", "answer": "[put real answer here]" }, "tNumber": { "sourceReferenceDocumentId": "[put a real source document id here]", "sourceReferenceExcerpt": "[put a real source excerpt here]", "notes": "[put real notes here]", "answer": "[put real answer here]" }, "result": { "sourceReferenceDocumentId": "[put a real source document id here]", "sourceReferenceExcerpt": "[put a real source excerpt here]", "notes": "[put real notes here]", "answer": "[put real answer here]" } } ] ``` Note that in the above structure the top level array contains complex objects, and those complex objects themselves contain four properties and the final answers are put to the respective "answer" properties along with the sources and notes for that answer. The "answer" property is the most important one. It is the contents of the form item you are filling. If you don't have anything for the answer, leave out the whole item. The final form will only have the "answer" property value visible, the other properties are for supporting functionality only. 
If there are no applicable test results or history, just output an empty JSON array for such case, like this: ``` "pathologyResults": [] ``` Similarly for example for Co-morbidities you might have a JSON Schema: ``` "coMorbidities": { "type": [ "array", "null" ], "items": { "type": "object", "properties": { "sourceReferenceDocumentId": { "type": [ "string", "null" ] }, "sourceReferenceExcerpt": { "type": [ "string", "null" ] }, "notes": { "type": [ "string", "null" ] }, "answer": { "type": "string" } } } } ``` For this schema, the following JSON would be compliant: ``` "coMorbidities": [ { "sourceReferenceDocumentId": "[put a real source document id here]", "sourceReferenceExcerpt": "[put a real source excerpt here]", "notes": "[put real notes here]", "answer": "[put real answer here]" } ] ``` Note how that's a bit simpler, because inside the array you only have the form answer objects themselves. Regardless, the form answer objects are a bit complex as they have the source references and notes in addition to the answer which is the text you actually write to the form field. The schema you will need to respect will include array types in it, be sure to fill those up properly. Typically you will encounter arrays where items are complex objects, not just primitive types. It's complicated, but I know you can do it without any mistakes! Be super pedantic to match all the brackets and braces properly. Typos in those are extremely bad! Now, let's make sure the patient will receive the best possible care by filling up the first consult form perfectly so that the treating doctor will have all the information they need to treat this patient! Your response must be only valid JSON and must conform exactly to the first-consult-form-schema JSON Schema. Look at the "description" fields in the schema for instructions for filling the respective field. Make sure all fields are filled according to information given in the document-chunk. 
Be conscientious to add ALL clinically relevant information from the document and the extracted facts, so that the treating doctor will get everything they need from the form! Do not use references to the answers from the previous chunks using constructs like: `...answersFromPreviousChunks` The JSON parser doesn't support this; instead you need to repeat the answers verbatim. Note that the root level of the above schema is an object, not an array. Look closely if you need to close the braces whether they are supposed to be curly braces or brackets; obviously braces must match the starting braces in all JSON. After having written the whole JSON, write . Do not stop your response before closing the response tag; your response is complete only after closing the preparation-form and response tags. This must be the final thing you write, write nothing after it. Do not worry about the length of the answer, by necessity you will have to write a really long document as an answer. """ messages = [ {"role": "user", "content": message}, ]
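# The prompt above asks that every form field with an answer also carries a
# "sourceReferenceExcerpt" that matches the document chunk (verbatim or as a
# regex), and that no list item ends up with an empty answer. Below is a
# minimal sketch of how a caller might check a parsed response against those
# two rules. The names check_form_items and excerpt_matches are hypothetical
# helpers, not part of the original code, and the whitespace normalization is
# an assumption about how OCR text should be compared.

```python
import re


def excerpt_matches(excerpt: str, chunk: str) -> bool:
    """True if the excerpt occurs in the chunk, verbatim or as a regex."""
    def normalize(text: str) -> str:
        # Collapse runs of whitespace so line breaks do not defeat matching.
        return " ".join(text.split())

    if normalize(excerpt) in normalize(chunk):
        return True
    try:
        return re.search(excerpt, chunk) is not None
    except re.error:
        # The excerpt is not a valid regular expression.
        return False


def check_form_items(form: dict, chunk: str) -> list[str]:
    """Collect problems in a filled form: empty answers, missing or unmatched excerpts."""
    problems: list[str] = []

    def visit(node, path: str) -> None:
        if isinstance(node, dict):
            if "answer" in node:  # a "field" in the prompt's terminology
                answer = node.get("answer")
                excerpt = node.get("sourceReferenceExcerpt")
                if not isinstance(answer, str) or not answer.strip():
                    problems.append(f"{path}: empty answer")
                elif not excerpt:
                    problems.append(f"{path}: answer without a source excerpt")
                elif not excerpt_matches(excerpt, chunk):
                    problems.append(f"{path}: excerpt not found in chunk")
                return
            for key, value in node.items():
                visit(value, f"{path}.{key}")
        elif isinstance(node, list):
            for index, item in enumerate(node):
                visit(item, f"{path}[{index}]")

    visit(form, "$")
    return problems
```

# Such a check would run on the JSON parsed from the preparation-form part of
# the model's reply; it complements, rather than replaces, validation against
# the JSON Schema itself.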