I tried out the page. Honestly it's probably about the speed I'd expect if you take away all the extra fluff from a site like Twitter.
So the process for most (not all) modern compilers is
Source Language -> Compiler Front-end -> LLVM-IR (or a similar intermediate representation) -> Compiler Back-end -> Target Language
almost regardless of target. "Target language" here includes machine code instructions in a binary executable, or, in this case, a web page.
Upon a brief inspection, however, it doesn't look like there's any JS at all, actually. At least on the front page, which is all I really inspected. What it looks like to me is that they have a server-side program that reacts to GET requests and sends back the appropriate HTML.
So the majority of this site would be a normal application binary running on their server and the web interface is just a small view into that.
But I've only very briefly looked it over.
… Sort of. It's complicated, hehe.
As you probably know, Java traditionally runs on the Java Virtual Machine. But there's technically nothing preventing someone from making a compiler for the Java language that compiles to pure machine code. (GraalVM's Native Image does this for Java these days, as I understand it.) The language Kotlin, which also runs on the Java Virtual Machine, has a project called Kotlin Native to achieve exactly this.
Do different languages have strengths and weaknesses? Yes, you could absolutely say that. But it's not always as straightforward as "language A is fast, B uses less memory, C is better for the web" or whatever. We can really adapt pretty much any tool to any job.
And I mean your understanding isn't completely wrong or anything.
Write the exact same program in C and Java (they're similar enough in syntax and, to an extent, semantics that this can be possible, class structure aside) and C will generally be faster because you don't need the JVM.
Left with the choice of competent Java or incompetent C, though, the Java will run faster regardless. In my original post I mentioned linear search and binary search; let me elaborate a bit on that.
Let's say you're looking for the number 42 in a list of 10 numbers. You know the list is sorted lowest to highest, but you have no idea of the range of the numbers. It could be 1, 2, 3, 4, 10, 11, 14, 42… or it could be 42, 50, 100, 20 million… and so on. You don't know. How do you find the number? A linear search just goes through the list one number at a time. Thus the time complexity, i.e. the worst-case time to finish, is O(N), where N is the length of the list. That is, the time to finish scales linearly with the number of elements to look over.
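To make that concrete, here's a minimal linear search sketch in Java (the class and method names are just mine for illustration):

```java
// Linear search: scan the list one element at a time until we find the
// target or run out of elements. Worst case we look at every element: O(N).
public class LinearSearch {
    // Returns the index of target in xs, or -1 if it's not present.
    static int linearSearch(int[] xs, int target) {
        for (int i = 0; i < xs.length; i++) {
            if (xs[i] == target) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] xs = {1, 2, 3, 4, 10, 11, 14, 42};
        System.out.println(linearSearch(xs, 42)); // prints 7 (the last index)
    }
}
```

Note that for 42 sitting at the end of the list, this really does visit every single element before finding it.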
A more efficient approach is to start with the middle number, say the fifth element, and ask "are we looking for a bigger or smaller number?" Then take the middle number in that direction: so if 42 is smaller than the fifth element, we'd look at number 3. If it's still smaller, we just check 1 and 2; if it's bigger, the number is in position 4.
With this way of searching our time complexity is O(log N), or in other words, the time to finish scales logarithmically with the number of elements in the list: every time we look, we cut the search space in half.
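The halving approach, sketched in Java (class and method names are again just for illustration):

```java
// Binary search over a sorted array: compare against the middle element,
// then discard the half that can't possibly contain the target. Each
// comparison halves the search space, so the worst case is O(log N).
public class BinarySearch {
    // Returns the index of target in the sorted array xs, or -1 if absent.
    static int binarySearch(int[] xs, int target) {
        int lo = 0, hi = xs.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2; // written this way to avoid int overflow
            if (xs[mid] == target) return mid;
            if (xs[mid] < target) lo = mid + 1; // target must be in the upper half
            else hi = mid - 1;                  // target must be in the lower half
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] xs = {1, 2, 3, 4, 10, 11, 14, 42};
        System.out.println(binarySearch(xs, 42)); // prints 7, after only 4 comparisons
    }
}
```

On 8 elements the difference is trivial, but on a billion sorted elements the linear scan does up to a billion comparisons while this does about 30, in any language.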
On a big enough dataset, the binary search method will always be faster than the linear search regardless of language or runtime.
Which brings us neatly to our next complication: runtimes. Because a language also has a runtime component. A language in itself is not really fast or slow, but the way it compiles and runs can be. The runtime is everything in a given execution context that is not handled ahead of time. It's often considered a core part of the language, but in a way it's a separate program that runs alongside programs written in that language.
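As a small illustration of the runtime being its own thing: in Java your program can literally talk to the runtime it's riding on through java.lang.Runtime, for example to ask about the heap the JVM manages for you (the printed numbers will obviously vary per machine):

```java
// The JVM runtime exposes itself to the running program via java.lang.Runtime.
// Heap bookkeeping like this is work the runtime does for you; an equivalent
// C program without such a runtime would have to track its own allocations.
public class RuntimePeek {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("Processors: " + rt.availableProcessors());
        System.out.println("Heap total: " + rt.totalMemory() + " bytes");
        System.out.println("Heap free:  " + rt.freeMemory() + " bytes");
    }
}
```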
Runtimes are often what people actually mean when they talk about the speed of a language. In some cases the runtime is unavoidable if you want to meaningfully use a given language; in other cases your reliance on it is, to an extent, a choice.
And last but not least, I just want to touch on the "speed on what hardware" issue as well. Most C compilers, for example, can optimise heavily for a specific micro-architecture. Way beyond just x86_64, all the way down to something like "Haswell chip" (e.g. GCC's -march=haswell). And that's fantastic, but it could also result in terrible performance on another target, even one with the same ISA, x86_64. This would of course be unlikely, and it would require tremendous changes in the relative speed of certain instructions, but it's theoretically possible.
An interpreted environment, like the one JavaScript usually runs in, can tailor the code to your hardware every single time you run it. The results may not be as good today, but without your code ever being recompiled, an update to the interpreter can improve its speed.
And in the end, as mentioned, any language can theoretically be made to do anything. Whether the tooling currently exists to achieve that easily, I can't guarantee, but it's possible. So in short: the code you write matters more than the language it's written in.