
How to view binaries from the Linux command line




Do you have a mystery file? The Linux file command will quickly tell you what type of file it is. If it is a binary file, though, you can find out even more about it. file has a whole host of stablemates that will help you analyze it. We'll show you how to use some of these tools.

Identifying File Types

Files usually have characteristics that allow software packages to identify what type of file they are and what their data represents. It wouldn't make sense to try opening a PNG file in an MP3 music player, so it is both useful and pragmatic for a file to carry some form of ID.

This ID may be a handful of signature bytes at the very start of the file, which lets the file be explicit about its format and content. Sometimes the file type is inferred from a distinctive aspect of the internal organization of the data itself, known as the file architecture.
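The signature-byte idea can be sketched in a few lines of shell. This is only an illustration, not how file itself works internally: the function name guess_type is made up for this example, and the table holds just a handful of well-known signatures.

```shell
#!/bin/sh
# Guess a file's type from its leading "signature" bytes, ignoring its name.
# The signature table is a tiny illustrative subset, not a complete database.

guess_type() {
    # Read the first 8 bytes and render them as one lowercase hex string.
    sig=$(head -c 8 "$1" | od -An -tx1 | tr -d ' \n')
    case "$sig" in
        89504e470d0a1a0a) echo "PNG image" ;;
        25504446*)        echo "PDF document" ;;   # "%PDF"
        7f454c46*)        echo "ELF binary" ;;     # 0x7F followed by "ELF"
        504b0304*)        echo "ZIP archive" ;;    # ODT and DOCX files are ZIP containers
        *)                echo "unknown" ;;
    esac
}

# Demo: the temporary file's name says nothing about its type,
# but its first bytes do.
demo=$(mktemp)
printf '%%PDF-1.5\n' > "$demo"
guess_type "$demo"
rm -f "$demo"
```

The demo prints "PDF document". A real identification tool consults a far larger set of signatures and also inspects the internal structure of the data.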

Some operating systems, such as Windows, are guided entirely by a file's extension. You can call it gullible or trusting, but Windows assumes that any file with the DOCX extension really is a DOCX word processing file. Linux isn't like that, as you'll soon see. It wants proof and looks inside the file to find it.

The tools described here were already installed on the Manjaro 20, Fedora 21, and Ubuntu 20.04 distributions that we used to research this article. Let's start our investigation with the file command.

Using the file Command

We have a collection of different file types in our current directory. They are a mix of document, source code, executable and text files.

The ls command shows us what's in the directory, and the -hl (human-readable sizes, long listing) option shows us the size of each file:

  ls -hl 

  ls -hl in a terminal window.

Let's try file on a few of these and see what we get:

  file build_instructions.odt 
  file build_instructions.pdf 
  file COBOL_Report_Apr60.djvu

  file build_instructions.odt in a terminal window.

The three file formats are correctly identified. Where possible, file gives us a little more information. The PDF file is reported to be in the version 1.5 format.

Even if we rename the ODT file to have an extension with the arbitrary value of XYZ, the file is still correctly identified, both within the Files file browser and on the command line with file.

 OpenDocument file correctly identified in the file browser, even if the extension is XYZ.

Within the Files file browser, it is given the correct icon. On the command line, file ignores the extension and looks inside the file to determine its type:

  file build_instructions.xyz 

  file build_instructions.xyz in a terminal window.

Using file on media, such as picture and music files, usually provides information about their format, encoding, resolution, etc.:

  file screenshot.png 
  file screenshot.jpg 
  file Pachelbel_Canon_In_D.mp3 

  file screenshot.png in a terminal window.

Interestingly, even with plain text files, file doesn't judge the file by its extension. For example, if you have a file with the ".c" extension that contains ordinary plain text rather than source code, file doesn't mistake it for a genuine C source code file:

  file function+headers.h
  file makefile
  file hello.c

  file function+headers.h in a terminal window.

file correctly identifies the header file (".h") as part of a C source code collection of files, and knows that the makefile is a script.

Using file with binary files

Binary files are more of a "black box" than others. Image files can be viewed, sound files can be played and document files can be opened with the appropriate software package. Binary files are more of a challenge, however.

For example, the files "hello" and "wd" are binary executables. They are programs. The file named "wd.o" is an object file. When source code is compiled by a compiler, one or more object files are created. These contain the machine code that the computer will eventually execute when the finished program runs, together with information for the linker. The linker checks each object file for function calls into libraries and links in any libraries the program uses. The result of this process is an executable file.

The "watch.exe" file is a binary executable file that is compiled to run on Windows:

  file wd 
  file wd.o 
  file hello 
  file watch.exe

  file wd in a terminal window.

Taking the last one first, file tells us that the "watch.exe" file is a PE32+ executable console program for the x86 processor family on Microsoft Windows. PE stands for Portable Executable, a format that has 32- and 64-bit versions. PE32 is the 32-bit version and PE32+ is the 64-bit version.

The other three files are all identified as Executable and Linkable Format (ELF) files. This is a standard for executable files and shared object files, such as libraries. We'll take a look at the ELF header format shortly.

What you might notice is that the two executables ("wd" and "hello") are identified as LSB shared objects, and the object file "wd.o" is identified as an LSB relocatable. In file's output, LSB means least significant byte first, that is, little-endian byte order. The word executable is conspicuous by its absence.

Object files are relocatable, which means the code inside them can be loaded into memory at any location. The executables are listed as shared objects because the linker has created them from the object files in such a way that they inherit this capability.

This allows the Address Space Layout Randomization (ASLR) system to load the executables into memory at addresses of its choosing. Standard executables have a load address coded into their headers, which dictates where they are loaded into memory.

ASLR is a security technique. Loading executables into memory at predictable addresses makes them susceptible to attack, because their entry points and the locations of their functions will always be known to attackers. Position Independent Executables (PIE), loaded at a random address, overcome this susceptibility.

If we compile our program with the gcc compiler and provide the -no-pie option, it will generate a conventional executable.

The -o (output file) option lets us provide a name for our executable:

  gcc -o hello -no-pie hello.c 

We'll use file on the new executable and see what has changed:

  file hello 

The size of the executable file is the same as before (17 KB):

  ls -hl hello 

 gcc -o hello -no-pie hello.c in a terminal window.

The binary file is now identified as a standard executable. We're only doing this for demonstration purposes. If you compile applications this way, you will lose all the benefits of ASLR.

Why is an executable file so large?

Our example hello program is 17 KB, so it could hardly be called large, but then everything is relative. The source code is 120 bytes:

  cat hello.c 

What is bloating the binary if all it does is print one string to the terminal window? We know there is an ELF header, but that is only 64 bytes long for a 64-bit binary. Plainly, it must be something else:

  ls -hl hello 

 cat hello.c in a terminal window.

Let's scan the binary file with the strings command as a simple first step to discover what's inside it. We'll pipe the output into less:

  strings hello | less 

 strings hello | less in a terminal window.
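To get a feel for what strings is doing, here is a rough stand-in built from tr and grep. It is a sketch, not a replacement: the function name printable_runs is made up for this example, and the real tool handles more cases, such as other character encodings. The four-character minimum mirrors the default minimum length that strings uses.

```shell
#!/bin/sh
# Rough stand-in for strings: print runs of 4 or more printable characters.

printable_runs() {
    # LC_ALL=C makes tr and grep treat the input as raw bytes.
    # tr turns every non-printable byte into a newline, so each run of
    # printable bytes lands on its own line; grep keeps runs of length >= 4.
    LC_ALL=C tr -c '[:print:]' '\n' < "$1" | LC_ALL=C grep -E '.{4,}'
}

# Demo: a small made-up "binary" with text embedded between control bytes.
demo=$(mktemp)
printf 'ab\000\001Hello, Geek world!\000xyz\002.text\000' > "$demo"
printable_runs "$demo"
rm -f "$demo"
```

The demo prints "Hello, Geek world!" and ".text". The shorter runs "ab" and "xyz" are dropped because they fall below the four-character minimum.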

There are many strings within the binary besides the "Hello, Geek world!" from our source code. Most of them are labels for regions within the binary, and the names and linking information of shared objects. These include the libraries, and the functions within those libraries, on which the binary depends.

The ldd command shows us the shared object dependencies of a binary:

  ldd hello 

 ldd hello in a terminal window.

There are three entries in the output. Two of them include a directory path (the first does not):

  • linux-vdso.so: The Virtual Dynamic Shared Object (VDSO) is a kernel mechanism that allows a set of kernel-space routines to be accessed by a user-space binary. This avoids the overhead of a context switch from user mode to kernel mode. VDSO shared objects adhere to the Executable and Linkable Format (ELF), which allows them to be dynamically linked to the binary at runtime. The VDSO is dynamically allocated and takes advantage of ASLR. The VDSO capability is provided by the standard GNU C Library if the kernel supports the ASLR scheme.
  • libc.so.6: The GNU C Library shared object.
  • /lib64/ld-linux-x86-64.so.2: This is the dynamic linker that the binary wants to use. The dynamic linker interrogates the binary to discover what dependencies it has. It loads those shared objects into memory. It prepares the binary to run and to be able to find and access the dependencies in memory. Then it launches the program.
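The three kinds of entry can be told apart mechanically. The sketch below classifies each line of ldd output according to whether it carries a filesystem path. The function name classify_ldd is made up for this example, and the sample input is hypothetical: it follows the usual ldd layout, but the addresses and the libc path are invented.

```shell
#!/bin/sh
# Classify each line of ldd output by whether it carries a filesystem path.

classify_ldd() {
    awk '
        # "name => /path (address)": a library resolved to a file on disk
        $2 == "=>" && $3 ~ /^\// { print "resolved: " $1 " -> " $3; next }
        # "/path (address)": listed by absolute path, such as the dynamic linker
        $1 ~ /^\//               { print "absolute: " $1; next }
        # "name (address)": no path at all, such as the VDSO
        NF > 0                   { print "no path:  " $1 }
    '
}

# Hypothetical sample input; on a real system use: ldd hello | classify_ldd
printf '\tlinux-vdso.so.1 (0x00007ffc0a5f2000)\n\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c2a000000)\n\t/lib64/ld-linux-x86-64.so.2 (0x00007f1c2a400000)\n' | classify_ldd
```

With the sample input, the VDSO is reported as having no path, libc as resolved to a file, and the dynamic linker as an absolute path.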

The ELF header

We can examine and decode the ELF header using the readelf utility and the -h (file header) option:

  readelf -h hello 

 readelf -h hello in a terminal window.

The header is interpreted for us.

 Output from readelf -h hello in a terminal

The first byte of every ELF binary is set to hexadecimal value 0x7F. The next three bytes are set to 0x45, 0x4C, and 0x46. The first byte is a flag that identifies the file as an ELF binary. To make this crystal clear, the next three bytes spell out "ELF" in ASCII. The remaining header fields are:

  • Class: Indicates whether the binary is a 32- or 64-bit executable (1 = 32, 2 = 64).
  • Data: Indicates the endianness in use. Endian encoding defines the way in which multibyte numbers are stored. In big-endian encoding, a number is stored with its most significant byte first. In little-endian encoding, the number is stored with its least significant byte first.
  • Version: The version of ELF (currently it is 1).
  • OS / ABI: Specifies the type of application binary interface in use. This defines the interface between two binary modules, such as a program and a shared library.
  • ABI version: The version of the ABI.
  • Type: The ELF binary type. The common values are ET_REL for a relocatable resource (such as an object file), ET_EXEC for an executable compiled with the -no-pie flag, and ET_DYN for an ASLR-aware executable.
  • Machine: The Instruction Set Architecture. This indicates the target platform for which the binary file was created.
  • Version: Always set to 1, for this version of ELF.
  • Entry point address: The memory address within the binary at which execution begins.

The other entries are the size and number of regions and sections within the binary file so that their locations can be calculated.
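A few of these fields are easy to decode by hand, which shows that readelf is reading fixed positions in the file. The sketch below uses only od. The function names byte_at and describe_elf are made up for this example. It reads the class byte at offset 4, the data byte at offset 5, and the two-byte type field at offset 16, and it ignores every other header field.

```shell
#!/bin/sh
# Decode a few ELF header fields by hand, using only od.

byte_at() {
    # Print the byte at offset $2 of file $1 as a decimal number.
    od -An -tu1 -j "$2" -N 1 "$1" | tr -d ' \n'
}

describe_elf() {
    # The first four bytes must be 0x7F followed by "ELF".
    magic=$(od -An -tx1 -N 4 "$1" | tr -d ' \n')
    [ "$magic" = "7f454c46" ] || { echo "not an ELF file"; return 1; }

    case $(byte_at "$1" 4) in
        1) class="32-bit" ;; 2) class="64-bit" ;; *) class="unknown class" ;;
    esac
    case $(byte_at "$1" 5) in
        1) data="little-endian" ;; 2) data="big-endian" ;; *) data="unknown byte order" ;;
    esac

    # The type is a 2-byte field at offset 16, stored in the file's own byte
    # order, so its low byte sits at offset 16 (little) or 17 (big).
    if [ "$data" = "big-endian" ]; then lo=17; else lo=16; fi
    case $(byte_at "$1" "$lo") in
        1) etype="ET_REL" ;; 2) etype="ET_EXEC" ;; 3) etype="ET_DYN" ;; *) etype="other" ;;
    esac
    echo "$class $data $etype"
}

# Demo: a made-up 18-byte header fragment, just enough for the fields above.
demo=$(mktemp)
printf '\177ELF\002\001\001\000\000\000\000\000\000\000\000\000\003\000' > "$demo"
describe_elf "$demo"
rm -f "$demo"
```

The demo fragment decodes as "64-bit little-endian ET_DYN", the combination the Type entry above describes as an ASLR-aware executable. Running describe_elf on a real binary such as hello should agree with the Class, Data, and Type lines that readelf -h reports.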

A quick look at the first eight bytes of the binary file with hexdump shows the signature byte and the "ELF" string in the first four bytes of the file. The -C (canonical) option gives us the ASCII representation of the bytes alongside their hexadecimal values, and the -n (number) option lets us specify how many bytes we want to see:

  hexdump -C -n 8 hello 

 hexdump -C -n 8 hello in a terminal window.

objdump and the Granular View

If you want to see the nitty-gritty detail, you can use the objdump command with the -d (disassemble) option:

  objdump -d hello | less 

 objdump -d hello | less in a terminal window

This disassembles the executable machine code and displays it as hexadecimal bytes alongside the assembly language equivalent. The address of the first byte in each line is shown on the far left.

This is only useful if you can read assembly language, or if you are curious about what goes on behind the curtain. There is a lot of output, so we piped it into less.

 Output of objdump -d hello | less in a terminal window

Compile and Link

There are many ways to compile a binary. For example, the developer chooses whether to include debugging information. The way the binary is linked also plays a role in its content and size. If the binary references shared objects as external dependencies, it will be smaller than one into which the dependencies are statically linked.

Most developers already know the commands we've covered here. For others, however, they offer some easy ways to browse and see what's inside the binary black box.



