Discussion about this post

Mark Ramm

Testing over the weekend shows improved performance from Gemini, with 2 out of 7 tests showing sufficient epistemic flexibility to recognize the facts in the Crisis Published document. Not sure whether this is random variance or evidence that things are improving, but it does seem to be a good sign.

Amy Wright

I'm flatly gobsmacked. "It then fabricated technical evidence, including dozens of false '404 Not Found' errors for working news links, and ultimately suggested I might be living in a simulation rather than admit it was wrong." Talk about misinformation! I enjoyed your analysis of this issue. Thanks!
