Benefits of the explanation

Downsides of using the Explanation class

(0:00 - 0:25)
So, as you are going through the course, you might be asking: why do we even need the Explanation class? That is, why do we need to ask for an explanation at all, because you can see that it has a few problems. The first is that it substantially increases the total number of tokens in the output. Suppose you have a simple Boolean value coming back; let us take the example of whether the patient was disabled.

(0:25 - 1:07)
The actual answer is just a simple yes or no, true or false, but when you look at the explanation object corresponding to that variable, you have first of all the explanation string, which is the LLM's reasoning for why it chose that value. Then you have a matching phrase and a matching sentence, which means the information is also quite repetitive. You extract both the matching phrase and the matching sentence, and of course the matching phrase is supposed to be a subset of the matching sentence, so you get a lot of tokens coming back that carry the same content.
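To make the token overhead concrete, here is a minimal sketch of what such a schema might look like. The field names are assumptions based on the description in this lesson, not the exact course code.

```python
from pydantic import BaseModel

# Illustrative sketch only: the field names mirror the structure described in
# this lesson (explanation, matching phrase, matching sentence) but may not
# match the course code exactly.
class BoolWithExplanation(BaseModel):
    value: bool               # the actual answer, e.g. "was the patient disabled?"
    explanation: str          # the LLM's reasoning for choosing this value
    matching_phrase: str      # short span of the report that supports the value
    matching_sentence: str    # full sentence containing the matching phrase

class OutcomesWithExplanation(BaseModel):
    disabled: BoolWithExplanation
    # ...the other outcome fields follow the same pattern

class OutcomesPlain(BaseModel):
    disabled: bool            # the bare value costs only a token or two
```

The difference in output size is obvious: the plain version returns a single Boolean, while the explanation version returns four fields per extracted value, two of which largely repeat the source text.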

(1:07 - 1:57)
On top of that, the same sentence could match multiple different fields in some cases, which is also repetitive: the same set of tokens, in the same sequence, comes back several times in the response. So you might be asking why we should ask for an explanation at all if it increases the overall cost of the response coming back from the LLM APIs. That is a very valid question, and I am going to show you with some examples why it is still a pretty good option, because of all the other benefits you get when you make the call with this Explanation class.

Explanation provides a future reference

(0:00 - 0:59)
So, the simplest reason for using the Explanation class is as a future reference. Asking for the explanation is a bit like using logging when you are writing code: it serves as a debugging tool, one that you can go and check later, because you save it to some kind of JSON file as I have shown you before. A good example of this is the case from one of the previous lessons where the age was being returned in months rather than in years. The age value went above 1000, but when you checked the explanation it was very clear why the LLM gave that response: it was returning the value in months and not in years, even though all the other records were returning the age in years. That is a good example of using the explanation as a way to log the information.
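Here is a minimal sketch of that logging idea, assuming the results are saved as a list of records whose fields are shaped like the explanation objects above; the file name and field names are illustrative, not the course's exact code.

```python
import json

# Minimal sketch: save the LLM output (including explanations) to a JSON file,
# then audit it later. File name and field layout are assumptions.
def save_results(results: list, path: str = "results_with_explanations.json") -> None:
    with open(path, "w", encoding="utf-8") as f:
        json.dump(results, f, indent=2, ensure_ascii=False)

def audit_ages(path: str = "results_with_explanations.json", max_age: int = 120) -> None:
    with open(path, encoding="utf-8") as f:
        results = json.load(f)
    for record in results:
        age = record.get("age", {}).get("value")
        # An age above 120 is a strong hint the model returned months, not years;
        # the saved explanation tells you why it chose that number.
        if age is not None and age > max_age:
            print(age, "->", record["age"].get("explanation"))
```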

(1:00 - 2:48)
The other thing is that it also allows you to check for hallucinations, which is what you are usually trying to avoid. I will say that in the case of structured output, and given the way we have constructed the whole thing, with a very rigid structure that the model has to follow, the odds of getting actual hallucinations are pretty low. There are some cases where GPT or Gemini makes a mistake in the inference, but I would not call that an actual hallucination. I will even go so far as to say that most of the time hallucinations happen in the natural language generation part, and that is not what we are focusing on; we are focusing on the natural language understanding part. So I do not think there are too many chances of these hallucinations happening.

Now, I will say that you may still want to be on the lookout for these hallucinations. A good example is when you are expecting some value and the number you get back is not only different, but does not appear anywhere in the input prompt that you provided. In that case you should be a bit more concerned. I have not seen these hallucinations so far, which does not mean they do not happen, but the extraction seems to be pretty solid from what I have seen to date. So that is something you might want to consider.
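One way to stay on the lookout is to check that the supporting text actually appears in the report. This is a sketch of that idea, assuming each extracted field carries a matching_phrase and matching_sentence as described earlier.

```python
# Sketch of a simple grounding check: if the supporting text cannot be found in
# the original report, flag the field for manual review. Field names are the
# same illustrative ones used earlier in this section.
def is_grounded(report_text: str, field: dict) -> bool:
    text = report_text.lower()
    phrase = field.get("matching_phrase", "").lower()
    sentence = field.get("matching_sentence", "").lower()
    # A case-insensitive substring check is crude but catches invented evidence.
    return bool(phrase) and phrase in text and sentence in text

# Usage sketch:
# for name, field in record.items():
#     if isinstance(field, dict) and not is_grounded(report_text, field):
#         print(f"Possible hallucination in field: {name}")
```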

Explanation can speed up annotation for spaCy Prodigy

(0:00 - 0:19)
One more reason to get this explanation, along with the very specific structure that I use for it, is that it can serve as a good input for building custom spaCy machine learning models. Now, spaCy is a natural language processing library that you might have heard of. It is actually pretty well known.

(0:19 - 0:42)
If you are going to be working in this field, I would say that you probably need to know about it. You may sometimes not want to send your entire dataset to an LLM, either for price reasons or sometimes for performance reasons: if you have a really huge dataset and you need to make, say, a million calls to the API, that could take a substantial amount of time.

(0:42 - 1:24)
You may want something that finishes in a few minutes, and in that case LLMs are not a good choice. Price could also be a problem in some situations: with a million records, each of which could contain a lot of tokens, the cost adds up pretty quickly because of the per-token pricing of large language models. So what can you do? You can get a subset of the data annotated using an LLM, like what we are doing here, and then use that information to develop a custom machine learning model with spaCy.

(1:25 - 2:24)
The explanation object that we are constructing, because it has the matching phrase and the matching sentence, will allow you to speed up the annotation process within Prodigy, the annotation tool sold by the makers of spaCy. spaCy itself is free, but their machine teaching tool, Prodigy, is paid software, and all of this is outside the scope of this course. What I want you to remember is that by getting all this additional information, you make it much easier to use the output in the next step of your ML pipeline: think of it as the pre-processing stage, or the first step, where you get all this training data in a format that is very easy to use in the next step inside the Prodigy tool.
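To show what that hand-off could look like, here is an illustrative sketch that converts an LLM explanation into a pre-annotated example in the JSONL span format that Prodigy and spaCy can read. The field names follow the structure described in this lesson but are assumptions, not the exact course code.

```python
import json

# Turn one extracted field into a span-annotated example by locating the
# matching phrase in the report text. Field names are illustrative.
def to_prodigy_example(report_text: str, field: dict, label: str):
    phrase = field.get("matching_phrase", "")
    start = report_text.find(phrase)
    if not phrase or start == -1:
        return None  # supporting phrase not found verbatim; review manually
    return {
        "text": report_text,
        "spans": [{"start": start, "end": start + len(phrase), "label": label}],
    }

# Usage sketch: one JSON object per line, ready to load as a Prodigy dataset.
# with open("prodigy_input.jsonl", "w", encoding="utf-8") as f:
#     for report_text, record in annotated_subset:
#         example = to_prodigy_example(report_text, record["disabled"], "DISABLED")
#         if example:
#             f.write(json.dumps(example) + "\n")
```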

Explanation can provide more accurate responses

(0:00 - 0:29)
One more reason to use this Explanation class is simply to get better answers. Now, this is a bit of an intuition on my part, but my thinking is that if the LLM is forced to provide an explanation for each piece of information it extracts, that does two things. First, it acts as a sort of double check for the LLM itself to see if it got the information correct.

(0:29 - 1:21)
Now, I am not sure if that alone makes a big difference, but there is another way it helps: it ensures some kind of internal consistency, because if the LLM is explaining why it chose a certain value for a certain parameter, it cannot go to the next sentence and choose something that means the exact opposite. So there is an inherent constraint that you are placing on the LLM as it is trying to extract the information. What I think is that being forced to provide an explanation seems to improve the answers provided by GPT and Gemini, and I will show you an example of this in the next lesson so that you can get a clear picture; you can use data to validate what I am saying, you do not have to just go by intuition.

Better responses: an example

(0:00 - 0:15)
Alright, so let us go through a very clear-cut example. What I am going to do is extract outcomes, those are the Boolean values that we saw in the previous chapter, but this time without getting the explanation, using GPT.

(0:16 - 0:41)
So, what we will do is rewrite the code. As you can see, there is no Explanation class in the code anymore, you just have these four bool values. As before, I am going to use the file name GPT outcomes map no explanation, which is where I am going to save the information. I am going to iterate through all 100 reports in the CSV file, and once it is done I am going to save the results to that file.
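For reference, here is a minimal sketch of what that no-explanation run could look like, assuming the OpenAI Python SDK's structured-output helper. Only the disabled field is named in this lesson; the other outcome field names, the model name, the CSV file, and its column names are placeholders.

```python
import csv
import json
from openai import OpenAI
from pydantic import BaseModel

# Sketch of the "no explanation" schema: plain Booleans, no nested objects.
# Only `disabled` is named in the lesson; the other fields are placeholders.
class OutcomesNoExplanation(BaseModel):
    died: bool
    hospitalized: bool
    disabled: bool
    life_threatening: bool

client = OpenAI()

def extract_outcomes(report_text: str) -> OutcomesNoExplanation:
    # Structured output via the SDK's parse helper; model name is illustrative.
    response = client.beta.chat.completions.parse(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Extract the outcome flags from this VAERS report."},
            {"role": "user", "content": report_text},
        ],
        response_format=OutcomesNoExplanation,
    )
    return response.choices[0].message.parsed

results = {}
with open("vaers_reports.csv", newline="", encoding="utf-8") as f:  # assumed file/column names
    for row in csv.DictReader(f):
        results[row["VAERS_ID"]] = extract_outcomes(row["SYMPTOM_TEXT"]).model_dump()

with open("gpt_outcomes_map_no_explanation.json", "w", encoding="utf-8") as f:
    json.dump(results, f, indent=2)
```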

(0:41 - 6:21)
I have already run this particular step, so you do not have to watch the code execute during the video. What I will do next is work with the two different JSON files we now have. The first one is called GPT outcomes map, and the second is called GPT outcomes map no explanation dot JSON.

What you will do is take the two values that you get for each of these parameters; here the parameter I am looking at is just the disabled value. Remember, I have already shown you that getting the value for disabled is a very subjective sort of inference, and it is not clear cut just by reading a VAERS report whether or not the person became disabled. So this makes it a good test case for the data we are trying to extract.

So, now let us run this. What I have here is the disabled value coming from the CSV file, the disabled value coming from the LLM without explanation, the disabled value coming from the LLM with explanation, the disabled explanation corresponding to that value, and the disabled sentence corresponding to that value. Finally, there is a column called mismatch which, as you can see, checks whether the CSV value differs from either of the two LLM values. We then filter to see where mismatch equals true. In the first row, the CSV value differs from one of the LLM values.
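A sketch of how this comparison table could be built from the two JSON files; the JSON file names follow the lesson, while the CSV column names are assumptions for illustration.

```python
import json
import pandas as pd

with open("gpt_outcomes_map.json", encoding="utf-8") as f:
    with_expl = json.load(f)          # run that includes explanations
with open("gpt_outcomes_map_no_explanation.json", encoding="utf-8") as f:
    no_expl = json.load(f)            # run with plain Booleans only

csv_df = pd.read_csv("vaers_reports.csv", dtype=str).set_index("VAERS_ID")

rows = []
for report_id, record in with_expl.items():
    rows.append({
        "csv_disabled": csv_df.loc[report_id, "DISABLE"] == "Y",   # assumed column
        "llm_disabled_no_expl": no_expl[report_id]["disabled"],
        "llm_disabled_with_expl": record["disabled"]["value"],
        "disabled_explanation": record["disabled"]["explanation"],
        "disabled_sentence": record["disabled"]["matching_sentence"],
    })

df = pd.DataFrame(rows)
# Flag rows where the CSV disagrees with either of the two LLM runs.
df["mismatch"] = (df["csv_disabled"] != df["llm_disabled_no_expl"]) | (
    df["csv_disabled"] != df["llm_disabled_with_expl"]
)
print(df[df["mismatch"]])
```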

So the CSV and the LLM with explanation agree, but the one without explanation disagrees. What is the actual answer? The reporter ticked risk of disability, but there is no indication that the patient was actually disabled. In other words, at the time of reporting there was no indication that the patient was actually disabled; you can see that this is what the sentence says.

So the one without the explanation marks it as true, but the one with explanation correctly infers it as false. Now, let us look at the next one. The CSV is false, the one without explanation is true, and the one with explanation is also true.

So the two LLM runs agree in this case, which means it is not what we want to look at. In fact, what might be a better thing to do? Let us see, there are 12 entries.

What might be a better thing to do is ignore the mismatch column. Instead, let us look at rows where the disabled value from the run without explanation equals true and the value from the run with explanation equals false, and see when that happens.
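Continuing the comparison sketch from above, that filter might look like this (same illustrative column names as before):

```python
# Rows where the run without explanation says True while the run with
# explanation says False; the filter is flipped later in the lesson.
disagreement = df[df["llm_disabled_no_expl"] & ~df["llm_disabled_with_expl"]]
print(disagreement[["disabled_explanation", "disabled_sentence"]])
```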

You can see that there are three cases. The first is the one we already saw, but there is another one where the explanation says the patient died following the vaccination, so disability is not applicable. That is how it reasons that the patient did not become disabled, because they died, and it is correct to infer that disabled is false; yet you can see that the one without explanation marks it as true. Then there is another report with pretty much the same property.

The patient unfortunately passed away post vaccination, so the term disabled is not applicable here. Notice what happens when you force the model to provide an explanation.

You can see that the value it infers is getting better. Now let us flip the filter, so the run without explanation is false and the run with explanation is true. Again, the two runs disagree.

Without the explanation it thinks there is no disability; with the explanation it thinks there is a disability, the reasoning being that the events the reporter classified under other health categories can lead to disability. Now, this one is pretty ambiguous, and maybe you might say that the one without explanation got it right in this particular case.

Now, what about the next one? Without the explanation it marks disabled as false; with the explanation it marks it as true. The patient experienced a temporary disability and could barely walk due to arthralgia, and again this is one more example where either reading could be right or wrong, so you may not be able to choose between the two values.

What about this one? The patient was absent from work more frequently, suggesting some level of disability. The symptoms had continued for a long time and the patient was absent from work more frequently; with the explanation the model infers that there was a disability, while without the explanation it says there was none. I think I am going to go with this one.

I think the one with the explanation got this answer correct. Now, what you can infer from looking at this batch of values overall is that as soon as you ask for an explanation, you are forcing the large language model to put a bit more reasoning into whatever value it extracts from the report. And I am going to say that, for the most part, this works out to make the extracted information more accurate.

So, not just based on intuition but by looking at a specific example, you can see that asking for the explanation does seem to increase the accuracy of the information extracted by the large language model. Now, this applies to GPT; I have not used Gemini here yet, and I will talk about that in the next lesson.

What we can infer from the quality of GPT and Gemini explanations

(0:00 - 1:03)
Alright, so in the previous lesson I showed you an example using GPT where I extracted the disabled value without any explanation and then extracted it again with the explanation, and I showed that the one which comes back with the explanation tends to be more accurate. Now, I have not done the same for Gemini; I am going to leave that as an exercise for the student. All you have to do is look at the other code and do a bit of copy and paste, so it should be easy to implement, and it also tests your knowledge to see how well you have followed along. But I also did not do it for one more reason: my personal opinion is that GPT was providing better explanations, and I do not want to deliver that sort of verdict at this point, because I am not sure it is always true and I also think it is a bit subjective.

(1:03 - 1:52)
In my view, at least from what I saw, GPT's explanations were a bit more verbose, or let us say more descriptive. This of course means it consumes more tokens, which incrementally increases the cost a little, but it is still probably preferable because it makes it very clear why GPT is choosing a certain value. Even in the example from the previous lesson, you could see that it was making a pretty good effort to explain each and every value it selected. So my intuition, based on what I have seen so far, is that GPT can probably also do a better job of extracting the structured information.

(1:52 - 3:09)
In other words, based on the fact that it provides better explanations, I think GPT will also do a better job of the extraction itself. Again, this is a bit of an intuition, but it is a bit like writing out the steps when you do a calculation versus trying to do it in your head and skipping something: the step-by-step version is more likely to be correct. That is the way I think about this. Looking at what GPT provides and what Gemini provides, I noticed that GPT seems to be more rigorous even in the way it provides the explanation. This is why I want the student to do this task for themselves and check whether Gemini is doing better or worse than GPT. But just based on what I have seen so far, GPT seems to have better explanations, and based on that I am also inclined to conclude that GPT is doing a better job of extracting structured information.