Interesting topic, and difficult project.
You actually have a very large corpus, even if you don't choose to use all of it. One common way to do this kind of research is to work from a selection of results, found by either an automated or a manual search. So you could simply read through the transcripts until you find 10 or 100 or 1000 examples, and then stop. That wouldn't be comprehensive, but because you would take every example you come across rather than searching for a particular kind, the sample would not be skewed toward one type, so it would still be representative (at least of the section of texts you looked at).
I can't think of any other obvious ways to get examples like this, because other search methods will probably be biased toward either one type of metaphor or one type of reply. (For example, you could look for instances of the word "metaphor" to find metalinguistic commentary, but that would limit your results and discussion to cases where the person replying was very direct and even technical in their commentary. Or you could find one speaker who used metaphors often and take examples only from their speech, but that is not representative either, although it might be interesting in itself if the speaker were, for example, a relevant historical figure.)
I think there are four reasonable approaches here:
1. Choose a small sample (either a sub-section of the corpus or, better, however much it takes to reach a certain number of examples). [I have seen published research like this, even for phenomena that are easier to identify, such as a syntactic construction.]
2. Start from whatever examples you can find, and then search outward for more examples similar to those. This would be fastest, but it wouldn't be very representative because of what you might miss.
3. Spend a LOT of time going through a LOT of data, or get funding to hire research assistants to do the same, etc.
4. Use a shortcut, such as political commentary or another source that helps you identify metaphors. Maybe there are existing sources that document some instances. Still, it would be hard for these to be representative.
I'm assuming that (1) will be the best option, unless the others stand out to you as useful.
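To make option (1) concrete, here is a minimal sketch in Python of reading until you reach a fixed number of examples. Everything in it is my own illustration: the function name `sample_until_quota` is hypothetical, `is_example` stands in for your own manual judgment of each transcript (it is not an automatic metaphor detector), and shuffling the reading order with a fixed seed is an extra assumption I am adding so that the examples are not all drawn from the start of the corpus.

```python
import random
from typing import Callable, List, Sequence, Tuple


def sample_until_quota(
    transcripts: Sequence[str],
    is_example: Callable[[str], bool],
    quota: int,
    seed: int = 0,
) -> Tuple[List[str], int]:
    """Read transcripts in shuffled order until `quota` examples are found.

    `is_example` is a placeholder for the researcher's manual decision.
    Returns the examples found and the number of transcripts read.
    """
    order = list(transcripts)
    # Assumption: a seeded shuffle, so the reading order is reproducible.
    random.Random(seed).shuffle(order)

    found: List[str] = []
    read = 0
    for transcript in order:
        read += 1
        if is_example(transcript):
            found.append(transcript)
            if len(found) >= quota:
                break  # stop as soon as the quota is reached
    return found, read
```

Keeping the count of transcripts read alongside the examples lets you report how much of the corpus the sample actually covers.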
Note, of course, that genre and context will change the rate of metaphor usage. So you can probably skip over large portions of the corpus that are more procedural, or where there is little interaction between speakers, if there is an easy way to identify them. You could attempt to automate a search for passages where a medium-sized turn by one speaker is followed by a reply from another speaker, within a more flexible part of the proceedings, and then read each of those passages to see what comes up. But I don't know how much time that would really save you.
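As a rough sketch of that kind of automated pre-filter, assuming the transcripts can already be split into (speaker, text) turns: the function name `candidate_exchanges` and the word-count thresholds for "medium-sized" are arbitrary choices of mine, not established values, and the code only narrows down what to read. It does not identify metaphors.

```python
from typing import List, Tuple

Turn = Tuple[str, str]  # (speaker, text); an assumed input format


def candidate_exchanges(
    turns: List[Turn],
    min_words: int = 30,   # assumed lower bound for a "medium-sized" turn
    max_words: int = 300,  # assumed upper bound
) -> List[Tuple[Turn, Turn]]:
    """Return (turn, reply) pairs worth reading manually.

    A pair qualifies when the first turn has a medium word count and
    the turn immediately after it comes from a different speaker.
    """
    pairs: List[Tuple[Turn, Turn]] = []
    for first, reply in zip(turns, turns[1:]):
        word_count = len(first[1].split())
        different_speaker = first[0] != reply[0]
        if min_words <= word_count <= max_words and different_speaker:
            pairs.append((first, reply))
    return pairs
```

Each returned pair would still need to be read by hand; the thresholds would have to be tuned against a portion of the corpus you have already gone through.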