So on the other day I was making a custom implementation of the io
module for maybe-fut and I had to implement the BufReader
struct, which has this definition:
pub struct BufReader<R: ?Sized> {buf: Vec<u8>,filled: usize,pos: usize,inner: R,}
And so far so good, but since I'm quite a maniac about sorting the fields in a struct in alphabetic order, I decided to change it to:
pub struct BufReader<R: ?Sized> {buf: Vec<u8>,filled: usize,inner: R,pos: usize,}
But BHAM! After just this little change, the compiler started to complain and won't compile the code anymore:
error[E0277]: the size for values of type `R` cannot be known at compilation time--> src/main.rs:3:12|1 | pub struct BufReader<R: ?Sized> {| - this type parameter needs to be `Sized`2 | buf: Vec<u8>,3 | inner: R,| ^ doesn't have a size known at compile-time|= note: only the last field of a struct may have a dynamically sized type= help: change the field's type to have a statically known sizehelp: consider removing the `?Sized` bound to make the type parameter `Sized`
But why's that? Well, we need to explain a few things first.
Sized and ?Sized
First of all, what's the difference between Sized
and ?Sized
?
We're for sure all familiar with the Sized
trait, which is automatically implemented for types that have a known size at compile time. This means that the compiler knows how much memory to allocate for a variable of that type.
In general most of the types we use in Rust are Sized
, like i32
, f64
, String
, etc. However, there are some types that do not have a known size at compile time, such as trait objects (dyn Trait
), slices ([T]
), and unsized types in general.
While it exists a Sized
trait, there's not a Unsized
trait, so if we want to specify that a type can be either Sized
or not, we use the ?Sized
bound.
So, ?Sized
means that it may be Sized
, but not for sure.
Why do we need to specify ?Sized?
Okay, but couldn't we just omit ?Sized
if it's not needed?
Well, the reason is that Rust always assumes that type parameters are Sized
by default. This means that if you define a type parameter without any bounds, Rust will assume that it must be Sized
.
So doing
struct Foo<T> {value: T,}
or
struct Foo<T: Sized> {value: T,}
is the same, and both will require T
to be Sized
.
When do we need ?Sized?
Let's say that you can't actually instantiate a struct with an unsized type in it, like this:
struct Foo<T: ?Sized> {value: T,}fn main() {let foo: [u8] = Foo { value: [1, 2, 3] };}
To do so, you need like to box it, let's see this example:
trait MyTrait {fn foo(&self);}struct Wrapper<T: ?Sized> {inner: Box<T>,}fn use_trait_object(w: Wrapper<dyn MyTrait>) {w.inner.foo();}struct Foo;impl MyTrait for Foo {fn foo(&self) {println!("Foo");}}fn main() {let foo = Foo;let wrapper: Wrapper<dyn MyTrait> = Wrapper { inner: Box::new(foo) };use_trait_object(wrapper);}
So we have a Wrapper
that takes a dyn MyTrait
as a type parameter, which is an unsized type. We can only use it by boxing it, because the size of dyn MyTrait
is not known at compile time.
But of course, Box<dyn MyTrait>
is Sized
, so we can use it in a struct that has a Box
field.
So you may, as me, think that this doesn't make any sense! Box<dyn MyTrait>
is Sized
, so why do we need to specify ?Sized
in the Wrapper
struct?
The reason is that, while Box<dyn MyTrait>
is Sized
, the type parameter T
can't be used if we don't tell the compiler that it may be ?Sized
. If we don't specify ?Sized
, the compiler will assume that T
is Sized
, and it won't allow us to use unsized types like dyn MyTrait
.
Indeed if we try to remove the ?Sized
bound from the Wrapper
struct, we'll get a compilation error:
fn use_trait_object(w: Wrapper<dyn MyTrait>) {| ^^^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time|= help: the trait `Sized` is not implemented for `(dyn MyTrait + 'static)`note: required by an implicit `Sized` bound in `Wrapper`
So it doesn't matter if T
is wrapped in a Box
, what matters is that T
is sized, if we don't specify ?Sized
.
Why fields order matters
Now that we know what ?Sized
is, let's go back to the BufReader
struct.
Why does the order of the fields matter when using ?Sized
?
We could just accept the fact that only the last field of a struct can be ?Sized
. But I'm not that kind of person, I want to understand why!
The reason is actually quite simple: the order of the fields in a struct determines how the memory is laid out.
When you have a struct with a ?Sized
field, Rust needs to know how much memory to allocate for the other fields. If the ?Sized
field is not the last one, Rust can't easily determine the layout of the struct, because the type could either be Sized
or not; and if not things get complicated.
Does this impact also sized types?
At this point I've never thought about this, but should we also think about the order of fields when using Sized
types?
Well, the answer is: sometimes.
For instance, if we have a struct like this:
struct A {a: u8,b: u64,c: u8,}
and this:
struct B {a: u64,b: u8,c: u8,}
We expect that the size of A
and B
will be different, because the order of the fields is different, and will create different padding in memory.
Despite this, if we check the size of both structs, we'll see that they are equally sized:
fn main() {println!("Size of A: {}", std::mem::size_of::<A>());println!("Size of B: {}", std::mem::size_of::<B>());}
The output will be 16
for both struct.
So, in this case, the order of the fields doesn't matter, because Rust will align the fields in a way that they fit in the same memory space.
However, there is an exception to this rule, which is when we use the #[repr(C)]
attribute.
Repr C
For instance, if we use the #[repr(C)]
attribute, which is used to specify that the struct should have a C-compatible layout, the order of the fields will matter.
#[repr(C)]struct A {a: u8,b: u64,c: u8,}#[repr(C)]struct B {a: u64,b: u8,c: u8,}fn main() {println!("Size of A: {}", std::mem::size_of::<A>());println!("Size of B: {}", std::mem::size_of::<B>());}
The output will be different this time:
Size of A: 24Size of B: 16
So we can say that we can order fields in a struct without worrying about the size, unless we use the #[repr(C)]
attribute.
But again, why only the last field?
The last paragraph we've just read has just proven that Rust already knows how to handle the order of fields in a struct, so why does it only allow ?Sized
for the last field?
The reason is that while Rust knows how to handle the order of fields in a struct on a memory based, but not semantically, which is required for DSTs (Dynamically Sized Types).
So even if it's quite complicated to understand, it is actually quite simple: the last field of a struct is the only one that can be ?Sized
because it is the only one that can be dynamically sized.