Sunday, 09 April 2023

Rust Basics Series #3: Data Types in Rust

In the previous post about the Rust programming language, we looked at variables, constants and shadowing.

It is only natural to cover data types now.

What are data types?

Change the order of these words and you get your answer; "data types" -> "type of data".

The computer stores data as 0s and 1s, but to make sense of it when reading, we use a data type to say what those 0s and 1s mean.

Rust has two types of data types:

  1. Scalar data type: Types that store only a single value.
  2. Compound data type: Types that store multiple values, even values of different types.

In this article, I shall cover scalar data types. I will go through the second category in the next article.

Following is a brief overview of the four main categories of Scalar data types in Rust:

  • Integers: Stores whole numbers. Has sub-types for each specific use case.
  • Floats: Stores numbers with a fractional value. Has two sub-types based on size.
  • Characters: Stores a single Unicode character (a Unicode scalar value). (Yes, you can store an emoji* in a character.)
  • Booleans: Stores either a true or a false. (For developers who can't agree if 0 is true or if 0 means false.)

Integers

An integer in the context of a programming language refers to whole numbers. Integers in Rust are either Signed or Unsigned. Unsigned integers store only 0 and positive numbers, while Signed integers can store negative numbers, 0 and positive numbers.

💡
For an integer that is n bits long, the range of Signed integers begins at -(2^(n-1)) and ends at 2^(n-1) - 1. Likewise, the range of Unsigned integers starts at 0 and ends at 2^n - 1. For example, i8 ranges from -128 to 127 and u8 from 0 to 255.
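To make the range formula concrete, here is a minimal sketch that computes the limits from the number of bits and prints them next to the constants the standard library provides. The helper names signed_range and unsigned_range are made up for this illustration, and the helpers are only meant for lengths up to 64 bits.

```rust
/// Returns (min, max) for a signed integer that is `bits` wide (up to 64),
/// computed from the formula rather than looked up.
fn signed_range(bits: u32) -> (i128, i128) {
    let half = 1_i128 << (bits - 1); // 2^(n-1)
    (-half, half - 1)
}

/// Returns (min, max) for an unsigned integer that is `bits` wide (up to 64).
fn unsigned_range(bits: u32) -> (u128, u128) {
    (0, (1_u128 << bits) - 1) // 2^n - 1
}

fn main() {
    for bits in [8, 16, 32, 64] {
        let (smin, smax) = signed_range(bits);
        let (umin, umax) = unsigned_range(bits);
        println!("i{bits}: {smin} to {smax}");
        println!("u{bits}: {umin} to {umax}");
    }

    // The standard library exposes the same limits as constants.
    println!("i8::MIN = {}, i8::MAX = {}, u8::MAX = {}", i8::MIN, i8::MAX, u8::MAX);
}
```

In everyday code you would reach for constants such as i32::MIN and i32::MAX directly; the helpers only exist to show where those numbers come from.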

Following are the available Integer types based on the sign and length:

Length     Signed   Unsigned
8-bit      i8       u8
16-bit     i16      u16
32-bit     i32      u32
64-bit     i64      u64
128-bit    i128     u128
arch       isize    usize

As you can see, Rust has Signed and Unsigned integers of length 8, 16, 32, 64 and even 128!

The integers with *size (isize and usize) vary based on the architecture of the computer, because they are as wide as a pointer on the target machine. On 32-bit legacy computers, it is *32 and on modern 64-bit systems, it is *64. Rust's smallest supported pointer width is 16 bits, so on small micro-controllers it is *16.

The use of *size is to store data that is mostly related to memory (which is machine dependent), like pointers, offsets, lengths and indexes.
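As a rough illustration of how *size follows the machine, the sketch below prints the width of usize, compares it with the size of a reference, and uses usize for indexing and lengths. The helper nth is hypothetical, and the printed numbers depend on the machine you compile for.

```rust
use std::mem::size_of;

/// Returns the element at `index`; slices are indexed with `usize`.
fn nth(values: &[i32], index: usize) -> i32 {
    values[index]
}

fn main() {
    // usize and isize are as wide as a pointer on the target machine.
    println!("usize is {} bits wide", usize::BITS);
    println!("a reference occupies {} bytes", size_of::<&i32>());

    let values = [10, 20, 30];
    println!("values[2] = {}", nth(&values, 2));

    // len() also reports sizes as usize.
    let count: usize = values.len();
    println!("count = {count}");
}
```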

💡
When you do not explicitly specify a subset of the Integer type, the Rust compiler will infer its type to be i32 by default. Obviously, if the value is bigger or smaller than what i32 can hold, the Rust compiler will politely error out and ask you to manually annotate the type.
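Here is a small sketch of the default and of two ways to override it: a type annotation on the variable, or a type suffix on the literal. The helper bytes_of is hypothetical; it only reports how many bytes a value occupies so the chosen type becomes visible.

```rust
use std::mem::size_of_val;

/// Hypothetical helper: reports how many bytes a value occupies.
fn bytes_of<T>(value: &T) -> usize {
    size_of_val(value)
}

fn main() {
    let inferred = 42; // no annotation, so this is an i32
    let annotated: i64 = 3_000_000_000; // too large for i32, so annotate
    let suffixed = 255u8; // a type suffix on the literal also works
    let spaced = 1_000_u16; // an underscore before the suffix is allowed

    println!("inferred: {} bytes", bytes_of(&inferred));
    println!("annotated: {} bytes", bytes_of(&annotated));
    println!("suffixed: {} bytes", bytes_of(&suffixed));
    println!("spaced: {} bytes", bytes_of(&spaced));
}
```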

Rust not only allows you to store integers in their decimal form but also in the binary, octal and hex forms too.

For better readability, you can use an underscore (_) as a digit separator, in place of the commas you would use when writing big numbers by hand.

fn main() {
    let bin_value = 0b100_0101; // use prefix '0b' for Binary representation
    let oct_value = 0o105; // use prefix '0o' for Octals
    let hex_value = 0x45; // use prefix '0x' for Hexadecimals
    let dec_value = 1_00_00_000; // same as writing 1 Crore (1,00,00,000)

    println!("bin_value: {bin_value}");
    println!("oct_value: {oct_value}");
    println!("hex_value: {hex_value}");
    println!("dec_value: {dec_value}");
}

I have stored the decimal number 69 in binary form, octal form and hexadecimal form in the variables bin_value, oct_value and hex_value respectively. In the variable dec_value, I have stored the number 1 Crore (10 million) and have replaced the commas with underscores, as per the Indian numbering system. For those more familiar with the International numbering system, you may write this as 10_000_000.

Upon compiling and running this binary, I get the following output:

bin_value: 69
oct_value: 69
hex_value: 69
dec_value: 10000000
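The values print in decimal because that is the default for integers. Going the other way, the formatting macros can print an integer in another base. The following is a minimal sketch; the helper name in_all_bases is made up for this example.

```rust
/// Formats `n` in binary, octal and hexadecimal, each with its prefix.
fn in_all_bases(n: u32) -> (String, String, String) {
    (format!("{n:#b}"), format!("{n:#o}"), format!("{n:#x}"))
}

fn main() {
    let (bin, oct, hex) = in_all_bases(69);
    println!("binary: {bin}");
    println!("octal: {oct}");
    println!("hex: {hex}");

    // Without '#', the prefix is left out; a width pads with zeros.
    println!("padded binary: {:08b}", 69);
}
```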

Floating point numbers

Floating point numbers, more commonly known as "floats", are a data type that holds numbers that have a fractional value (something after the decimal point).

Unlike the Integer type in Rust, Floating point numbers have only two subset types:

  • f32: Single precision floating point type
  • f64: Double precision floating point type

As with the Integer type, when Rust has to infer the type of a variable that looks like a float, it falls back to a default, which is f64. This is because the f64 type has more precision than the f32 type and is almost as fast as the f32 type in most computational operations. Please note that both the floating point data types (f32 and f64) are Signed.
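To get a feel for the precision difference, the sketch below computes one third in both types and measures how far apart the results are. The function name third_gap is hypothetical.

```rust
use std::mem::size_of_val;

/// Absolute difference between 1/3 computed in f32 and in f64.
fn third_gap() -> f64 {
    let third32 = 1.0_f32 / 3.0;
    let third64 = 1.0_f64 / 3.0;
    (f64::from(third32) - third64).abs()
}

fn main() {
    let inferred = 2.5; // no annotation, so this is an f64
    println!("inferred float: {} bytes", size_of_val(&inferred));

    // DIGITS is the approximate number of significant decimal digits.
    println!("f32 keeps about {} decimal digits", f32::DIGITS);
    println!("f64 keeps about {} decimal digits", f64::DIGITS);

    println!("1/3 as f32 and as f64 differ by {:e}", third_gap());
}
```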

📋
The Rust programming language stores the floating point numbers as per the IEEE 754 standard of floating point number representation and arithmetic.

fn main() {
    let pi: f32 = 3.1400; // f32
    let golden_ratio = 1.610000; // f64
    let five = 5.00; // decimal point indicates that it must be inferred as a float
    let six: f64 = 6.; // even though the type is annotated, a decimal point is still
                       // **necessary**

    println!("pi: {pi}");
    println!("golden_ratio: {golden_ratio}");
    println!("five: {five}");
    println!("six: {six}");
}

Look closely at the 5th line. Even though I have annotated the type for the variable six, I still need to use at least the decimal point. Whether you put anything after the decimal point is up to you.

The output of this program is pretty predictable... Or is it?

pi: 3.14
golden_ratio: 1.61
five: 5
six: 6

In the above output, you might have noticed that while displaying the values stored inside the variables pi, golden_ratio and five, the trailing zeros that I specified at the time of variable declaration are missing.

Those zeros were never stored in the first place: 3.1400 and 3.14 are the same number, and by default the println macro prints the shortest decimal form that identifies the value. So no, Rust did not tamper with your variable's values.
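If you do want trailing zeros in the output, you can ask for a fixed number of decimal places when formatting. A minimal sketch, with a made-up helper called with_places:

```rust
/// Formats `value` with exactly `places` digits after the decimal point.
fn with_places(value: f64, places: usize) -> String {
    // `.*` takes the precision first and the value to print second.
    format!("{:.*}", places, value)
}

fn main() {
    let pi = 3.1400;
    let five = 5.00;

    println!("default: {pi} and {five}");
    println!("four places: {}", with_places(pi, 4));
    println!("two places: {}", with_places(five, 2));

    // A fixed precision can also be written directly in the format string.
    println!("inline: {pi:.4}");
}
```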

Characters

You can store a single character in a variable and the type is simply char. Like traditional programming languages of the '80s, you can store an ASCII character. But Rust's char goes further: it is four bytes wide and can hold any Unicode scalar value. This means that you can store an emoji in a single character 😉

💡
Some emojis are a mix of two existing emojis. A good example is the 'Fiery Heart' emoji: ❤️‍🔥. This emoji is constructed by combining two emojis using a zero width joiner: ❤️ + 🔥 = ❤️‍🔥

Storing such emojis in a single Rust variable of the character type is not possible.

fn main() {
    let a = 'a';
    let p: char = 'p'; // with explicit type annotation
    let crab = '🦀';

    println!("Oh look, {} {}! :{}", a, crab, p);
}

As you can see, I have stored the ASCII characters 'a' and 'p' inside variables a and p. I also store a valid Unicode character, the crab emoji, in the variable crab. I then print the characters stored in each of these variables.

Following is the output:

Oh look, a 🦀! :p
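The sketch below looks a little closer at char: its size in memory, the code point behind a character, and why a joined emoji does not fit in one. The helper scalar_count is hypothetical, and the emojis are written as \u{...} escapes so the code points are visible.

```rust
use std::mem::size_of;

/// Counts how many `char` values (Unicode scalar values) a string holds.
fn scalar_count(text: &str) -> usize {
    text.chars().count()
}

fn main() {
    let crab = '\u{1F980}'; // the crab emoji

    println!("a char occupies {} bytes", size_of::<char>());
    println!("'a' is U+{:04X}, {} byte(s) in UTF-8", 'a' as u32, 'a'.len_utf8());
    println!("{crab} is U+{:04X}, {} byte(s) in UTF-8", crab as u32, crab.len_utf8());

    // Heart + variation selector + zero width joiner + fire.
    let fiery_heart = "\u{2764}\u{FE0F}\u{200D}\u{1F525}";
    println!("{fiery_heart} is made of {} chars", scalar_count(fiery_heart));
}
```

Because the joined emoji is a sequence of several chars, it has to live in a string (&str or String) rather than in a single char.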

Booleans

The boolean type in Rust stores only one of two possible values: either true or false. If you wish to annotate the type, use bool to indicate the type.

fn main() {
    let val_t: bool = true;
    let val_f = false;

    println!("val_t: {val_t}");
    println!("val_f: {val_f}");
}

The above code, when compiled and executed, results in the following output:

val_t: true
val_f: false
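Rust also sidesteps the "is 0 true or false" debate: a number cannot stand in for a bool, so a condition has to be a real boolean expression. A small sketch, with a hypothetical helper is_nonzero:

```rust
/// Hypothetical helper: true when `n` is a non-zero number.
fn is_nonzero(n: i32) -> bool {
    n != 0
}

fn main() {
    let count = 0;

    // `if count { ... }` would not compile: the condition must be a bool.
    if is_nonzero(count) {
        println!("count is non-zero");
    } else {
        println!("count is zero");
    }

    // Going from bool to a number needs an explicit cast.
    println!("true as u8 = {}, false as u8 = {}", true as u8, false as u8);
}
```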

Bonus: Explicit typecasting

In the previous article about Variables in the Rust programming language, I showed a very basic temperature conversion program. In there, I mentioned that Rust does not allow implicit typecasting.

But that doesn't mean that Rust does not allow explicit typecasting either ;)

To perform explicit type casting, the as keyword is used, followed by the data type to which the value should be cast.

Following is a demo program:

fn main() {
    let a = 3 as f64; // f64
    let b = 3.14159265359 as i32; // i32

    println!("a: {a}");
    println!("b: {b}");
}

On line 2, instead of using '3.0', I follow the '3' with as f64 to denote that I want the compiler to cast '3' (an Integer) into a 64-bit float. The same goes for line 3, but there the type casting is lossy: the fractional part is discarded. Instead of 3.14159265359, b simply stores 3.

This can be verified from the program's output:

a: 3
b: 3
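Lossy casts have a few more corners worth knowing. The sketch below shows how as behaves for negative, out-of-range and NaN floats and for integers that do not fit the target type, and contrasts it with the standard From and TryFrom conversions. The helper to_i32 is made up for this example.

```rust
use std::convert::TryFrom; // already in the prelude on the 2021 edition

/// Converts with `as`: truncates toward zero and clamps to the i32 range.
fn to_i32(value: f64) -> i32 {
    value as i32
}

fn main() {
    println!("3.99 -> {}", to_i32(3.99)); // fraction dropped
    println!("-3.99 -> {}", to_i32(-3.99)); // truncates toward zero
    println!("1e20 -> {}", to_i32(1e20)); // clamps to i32::MAX
    println!("NaN -> {}", to_i32(f64::NAN)); // NaN becomes 0

    // Integer-to-integer casts keep only the low bits.
    println!("300 as u8 -> {}", 300_i32 as u8);

    // From is only available when the conversion cannot lose data.
    let wide = i64::from(7_i32);
    // TryFrom reports a value that does not fit instead of changing it.
    let narrow = u8::try_from(300_i32);
    println!("wide: {wide}, narrow failed: {}", narrow.is_err());
}
```

When silently losing data would be a bug, From and TryFrom are usually the safer choice; as is handy when truncation is what you actually want.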

Conclusion

This article covers the Primitive/Scalar data types in Rust. There are primarily four such data types: Integers, Floating point numbers, Characters and Booleans.

Integers are used to store whole numbers and they have several sub-types based on whether they are signed or unsigned and on their length. Floating point numbers are used to store numbers with some fractional values and have two sub-types based on length. The character data type is used to store a single, valid Unicode character. Finally, booleans are used to store either a true or false value.

In the next chapter, I'll discuss compound data types like arrays and tuples. Stay tuned.



from It's FOSS https://ift.tt/E9v5qIi
via IFTTT
