Artificial Intelligence Discussion

Yes, MS has added it everywhere, including as a formula.

1 Like

It couldn’t find the answer on the internet so it just guessed?

Or, did it write, “My dog ate my homework,” seeing that as an internet-evidenced historically successful excuse to avoid doing it?

LLMs are still not perfect at math; however, if you want to give it a go:

I recommend using Gemini 2.5 Pro (which is free via Google) with the two prompts in this arXiv paper, along with the temperature they use (0.1):

https://arxiv.org/pdf/2507.15855

It gave an answer with the steps walked through, but apparently the steps were wrong.

You can try it yourself if you want. I just typed it in. I’m not sure if you can feed the AI the image…

OK.
Note that Area (Tri ABD) = 1.5 * Area (Tri ADC).
Note that Area (Tri ABE) = Area (Tri BCE).

Ideas:

  1. Determine the areas of the three non-shaded triangles separately or two at a time.
  2. Or, make 8 equations and 8 unknowns:
    Area (Tri ABC) = Area (Tri ABD) + Area (Tri ADC)
    Area (Tri ABC) = Area (Tri ABE) + Area (Tri EBC)
    …

The AI took the approach of assigning coordinates to the points. The right answer apparently uses Menelaus's theorem. Taking a look at it, this seems to be a direct application of it.
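For what it's worth, the coordinate approach can be checked with exact arithmetic. This is only a sketch under assumptions pieced together from the thread, since the original problem statement isn't quoted here: D is on BC with BD:DC = 3:2, E is the midpoint of AC, F is where AD and BE cross, and the shaded region is the quadrilateral CDFE with area 2024. The helper names (`pt`, `intersect`, `polygon_area`) are made up for illustration.

```python
from fractions import Fraction

def pt(x, y):
    """Point with exact rational coordinates."""
    return (Fraction(x), Fraction(y))

def intersect(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4 (lines assumed not parallel)."""
    d1 = (p2[0] - p1[0], p2[1] - p1[1])
    d2 = (p4[0] - p3[0], p4[1] - p3[1])
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    t = ((p3[0] - p1[0]) * d2[1] - (p3[1] - p1[1]) * d2[0]) / denom
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])

def polygon_area(points):
    """Shoelace formula for a simple polygon whose vertices are given in order."""
    total = Fraction(0)
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

# Any non-degenerate triangle works, since area ratios are affine-invariant.
B, C, A = pt(0, 0), pt(5, 0), pt(0, 2)
D = pt(3, 0)                                  # assumed: BD:DC = 3:2
E = ((A[0] + C[0]) / 2, (A[1] + C[1]) / 2)    # assumed: E is the midpoint of AC
F = intersect(A, D, B, E)                     # assumed: F is where AD meets BE

shaded_fraction = polygon_area([C, D, F, E]) / polygon_area([A, B, C])
area_abc = Fraction(2024) / shaded_fraction   # assumed: shaded CDFE has area 2024

print(shaded_fraction, area_abc)
```

Under those assumptions the shaded quadrilateral comes out as 11/40 of ABC, which puts the area of ABC at 7360.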

LLMs are terrible at applying logic to conceptual math problems.

So in the case of FA's example, it was trying to apply the steps to explain a conceptual math problem (that it likely had not seen before in the same way).

And it does the steps BUT then fails in extrapolating the correct conceptual logic behind the diagram.

It hallucinates, basically. It still thinks it's correct because that's what it is designed to do.

Shrug.
I fed the problem to Gemini. It’s been thinking for 5 minutes now. Will tell you if it cracks it. Probably I’ll run out of free brainpower first.

I do not recall that.
.
.
.
.
.

My dog ate Menelaus’ Theorem.

2 Likes

Came up with an answer after struggling for some time… I’m asking another instance of the AI to verify the answer.

So the AI came up with 7360 (using Menelaus's theorem as the first step). The 2nd AI thought the answer was wrong. But the 1st AI thought the 2nd AI was wrong, and rewrote its answer. Then a 3rd AI thought it was just fine. And honestly I probably copied and pasted everything wrong anyway.

Fish, is the answer 7360?

1 Like

And I just typed it in also. I don’t think it can see well enough to make sense of the words or diagram.

Trying to use Menelaus:

In Euclidean geometry, Menelaus’s theorem, named for Menelaus of Alexandria, is a proposition about triangles in plane geometry. Suppose we have a triangle △ABC, and a transversal line that crosses BC, AC, AB at points D, E, F respectively, with D, E, F distinct from A, B, C.
I am having some issue making a single transversal line that passes through all three sides of a triangle.

According to the black box, AFD is a transversal of triangle BCE.
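Taking that at face value, here is roughly how the Menelaus step might go. It assumes the setup the thread implies rather than a quoted problem statement: BD:DC = 3:2, E the midpoint of AC, F the point where AD meets BE, the shaded quadrilateral CDFE with area 2024, and X standing for the area of ABC. The line through A, F, D meets the sides BC, CE (extended to A) and EB of triangle BCE, so with unsigned lengths:

```latex
% Menelaus on triangle BCE with transversal A-F-D (assumed configuration)
\frac{BD}{DC}\cdot\frac{CA}{AE}\cdot\frac{EF}{FB} = 1
\;\Longrightarrow\;
\frac{3}{2}\cdot\frac{2}{1}\cdot\frac{EF}{FB} = 1
\;\Longrightarrow\;
\frac{EF}{FB} = \frac{1}{3},\qquad \frac{EF}{EB} = \frac{1}{4}

% Areas, with X = [ABC]
[ABE] = \tfrac{1}{2}X,\qquad
[AFE] = \frac{EF}{EB}\,[ABE] = \tfrac{1}{8}X,\qquad
[ADC] = \tfrac{2}{5}X

% Triangle ADC splits into triangle AFE and the shaded quadrilateral CDFE
[CDFE] = [ADC] - [AFE] = \tfrac{2}{5}X - \tfrac{1}{8}X = \tfrac{11}{40}X = 2024
\;\Longrightarrow\; X = 7360
```

If the actual diagram differs from that guess, the ratios would need to be redone, but the same two moves (Menelaus for the split on BE, then subtracting areas) should carry over.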

I’m sure that I didn’t learn Menelaus’ theorem in school.

So, what does it take to answer math questions correctly? Typically, some combination of “I’ve learned something that seems relevant to this question” and “I can extrapolate from what I’ve learned to solve the problem”.

AI would seem to dominate on the first. It is basically taking an open book test where the book is anything that might be in its (probably) huge training set.

1 Like

I get an answer of 10,120 for the area of ABC:

Let X = area of ABC and note that BC = 5a and AC = 2b.

ABD = 3/2*a*h1 = 3/5*X
BCE = 1/2*b*h2
ADC = a*h1

ADC + BEC = a*h1 + 1/2*b*h2 - 2024 == K

Then X = ABD + BCE + ADC - K + 2024

Inserting the above values, we get the simplified expression of

X = 3/5*X + 2*2024

Solving for X results in X = 5*2024 = 10,120

And that is true when I’m working with human editors as well.

It does help that most of the human editors I work with, I’ve worked with for years. So they know me – and I know them – so when I push back on why I want to use some weird verbiage, they know I’m doing it for effect. (Yes, I can also explain exactly what I’m attempting. But yes, they also explain why the SOA doesn’t like me using the subsection header: “WHY PIE CHARTS SUCK” – (I am tempted to write an article titled “Why Pie Charts Don’t Have to Suck”)).

For the AI editors, some have degraded in quality, but that’s the nature of the training cycles. I have to turn off CoPilot/Grammarly when they get in a sucky phase.

2 Likes

Copilot came up with: 4554.

I’m told the correct answer is 7360.

1 Like

Chatbots are great for rewriting content and roughing out ideas. It’s crazy to use them for anything that requires a specific answer.
I’ve got access to a chatbot from a life insurance company, for internal purposes. All it does is kick out wrong answers - using the company’s own documentation. Geesh.

1 Like

I’d be interested in seeing “the work” for this answer.

2 Likes