LibLISA – Instruction Discovery and Analysis on x86-64

83 points by Luc 3 days ago

saagarjha 2 days ago

This is neat but the analysis of their work leaves a bit to be desired. You can't just randomly select instructions and see if you did a good job, because the instruction space is not really uniform on any axis that people care about. For example, on a hypothetical ISA that has most the encoding space that is, like, simple arithmetic ops then you can get "good" coverage really easily. But that's not actually very useful because the instructions people care about when doing this kind of analysis are specific and usually more esoteric, and difficult to analyze with a simple bitstring approximation. Like, this definitely cannot discover the semantics of syscall, or rdrand. The authors claim they would have been able to discover reptar if they extended their work slightly, but I think it is pretty dubious that their methodology is powerful enough to do so.

pabs3 2 days ago

Reminds me of sandsifter, a fuzzer for the x86 ISA:

https://github.com/xoreaxeaxeax/sandsifter

amy-petrik-214 2 days ago

great piece on application of such - https://media.defcon.org/DEF%20CON%2026/DEF%20CON%2026%20pre...
- pabs3 a day ago
  
  The video of the talk about this is well worth watching too.

specialgoodness 2 days ago

This is interesting work but it totally misses the boat when it talks about the current state of the art. They cite a 2014 version of the Goel-Hunt-et al formal x86 model in ACL2, but they fail to talk about its modern version. The modern version (developed at Centaur and then Intel!) is an ACL2 x86 model that is so precise that it can boot Linux and run user-land programs. Let me say that again: it is a formal mathematical model of a processor that is so precise that it can boot Linux and run user-land programs! This is a monumental accomplishment and is not even mentioned in their paper.

jf 2 days ago

I've long wanted to have a way to see what actually happens inside a CPU when a set of instructions are executed. I'm pretty excited after skimming this paper as it looks like they developed a technique to automatically determine how the x86-64 instructions actually work by observing real world CPU behavior.

dzaima 2 days ago

This is determining the directly observable behavior rather than anything about microarchitectural specifics.
- ddingus 2 days ago
  
  And still quite a useful tool, particularly when exploring undocumented instructions.
fragmede 2 days ago

blinkenlights might be up your alley, if you haven't seen it before.
https://justine.lol/blinkenlights/
- rbanffy 2 days ago
  
  Every computer deserves a blinking lights front panel.
westurner 2 days ago

From https://news.ycombinator.com/item?id=33563857 :
> Memory Debugger
Valgrind > Memcheck, None: https://en.wikipedia.org/wiki/Valgrind ; TIL Memcheck's `none` provides a traceback where the shell would normally just print "Segmentation fault"
DynamoRio > Dr Memory: https://en.wikipedia.org/wiki/DynamoRIO#Dr._Memory
Intel Pin: https://en.wikipedia.org/wiki/Pin_(computer_program)
https://news.ycombinator.com/item?id=22095435, : SoftICE, EPT, Hypervisor, HyperDbg, PulseDbg, BugChecker, pyvmidbg (libVMI + GDB), libVMI Python, volatilityfoundation/volatility, Google/rekall -> yara, winpmem, Microsoft/linpmem, AVML,
rr, Ghidra Trace Format: https://github.com/NationalSecurityAgency/ghidra/discussions... https://github.com/NationalSecurityAgency/ghidra/discussions... : appliepie, orange_slice, cannoli
GDB can help with register introspection: https://web.stanford.edu/class/archive/cs/cs107/cs107.1202/l... :
> Auto-display and Printing Registers: The `info reg` command [and `info all-registers` (`i r a`)]
emu86 implements X86 instructions in Python, optionally in Jupyter notebooks; still w/o X86S, SIMD, AVX-512, x86-84-v4
- westurner 2 days ago
  
  > Valgrind `none` provides a traceback where the shell would normally just print "Segmentation fault"
  What would it take to get `bash` to print a --traceback after "Segmentation fault\n", and then possibly also --gdb like pytest --pdb?
  - [ ] ENH: bash: add valgrind `none`-style ~ --traceback after "Segmentation fault" given an env var or by default?

jxors 2 days ago

Hi! I'm one of the authors. Cool to see our work show up on HN!

I'm happy to answer questions if there are any.

jtotheh 11 hours ago

This may be a really dumb question, but is that much of the behavior of an x86_64 CPU variable and undefined? Until recently I thought the chipmakers provided full information (recently I found an article about people investigating the undocumented innards of the 286, IIRC). This seems like a pretty shaky foundation for software.