Mytype value; somefunction(&value) is different from mytype* value; somefunction(value)? Yes / no / why?

Aethalides · wrote on 10 Feb 2013, 16:23

Hi

As you can guess from the question, I am quite a newbie. I know what passing by reference is (&) and I know what pointers are and how you dereference them and such (or I think I know).

But the thing is, when I pass my pointer as *, the compiler complains that it cannot convert ** to *, implying that uname is expecting an *

So please can you tell me what the reason is for the observed behaviour:

@#include <sys/utsname.h>
#include <iostream>

int main(void) {
    utsname data;
    utsname* pointerData;

    int intResult=uname(&data); // this works
    int intOtherResult=uname(pointerData); // this does not (intOtherResult=0)

    if(intResult==0) {
        std::cout << "sysname: " << data.sysname << "; nodename: " << data.nodename << "; release: " <<
            data.release << "; version: " << data.version << "; machine: " << data.machine << std::endl;
    } else {
        std::cout << "No result" << std::endl;
    }

    return intResult;
}@

Zlatomir · wrote on 10 Feb 2013, 16:28

If you use a pointer, you also need to create an object for it to point to, so use:
@
utsname* pointerData = new utsname();

int intOtherResult=uname(pointerData); // this should work too
@

Aethalides · wrote on 10 Feb 2013, 16:39

Right that does work.

Thanks Zlatomir :)

utcenter · wrote on 11 Feb 2013, 05:14

You never initialize your pointer to anything. It would work if you did:

@pointerData = &data;@

When you create a pointer, if you don't initialize it to the address of an actual instance, it contains garbage data.

But in that particular case you don't really need to instantiate the utsname dynamically. new gets you a pointer plus an actual dynamic allocation, whereas the & operator gives you the memory address of any existing instance, i.e. a pointer to that instance.
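To make the options above concrete, here is a minimal sketch of three ways to hand uname() a pointer that actually points at something. Only uname() and utsname come from the thread; the function name allThreeWaysWork and the use of std::make_unique (instead of a bare new) are my own assumptions, and the code needs a POSIX system.

```cpp
#include <sys/utsname.h>  // POSIX: struct utsname and uname()
#include <memory>

// Hypothetical helper: tries three ways of handing uname() a valid utsname*.
// Returns true only if every call reports success (0).
bool allThreeWaysWork() {
    utsname data{};                       // a real object on the stack

    int viaAddressOf = uname(&data);      // 1. address of the local object

    utsname *pointerData = &data;         // 2. pointer initialized to that address
    int viaPointer = uname(pointerData);

    // 3. dynamic allocation; unique_ptr frees the object automatically
    auto heapData = std::make_unique<utsname>();
    int viaHeap = uname(heapData.get());

    return viaAddressOf == 0 && viaPointer == 0 && viaHeap == 0;
}
```

In all three cases uname() receives the address of an existing utsname object; the only thing that differs is where that object lives.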

There is no pass by reference in C; you can only pass by value or by pointer. But pointers can also be incremented, making it possible to walk off the memory address you actually need. So you need to make it a const pointer, and you need to dereference it whenever you use it. That is why references were introduced in C++: they save those two steps, so they are faster to type and less ugly to the eyes. A reference is an automatically dereferenced pointer with disabled pointer arithmetic.

There might be some confusion because you don't need to dereference references, and you work with them as if they are local objects, but in reality it is just a memory address, just like a pointer is, but optimized for a certain usage scenario.
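A small sketch of the three ways of passing an argument discussed above. The Counter type and the function names are hypothetical, made up purely for illustration.

```cpp
// Hypothetical type and function names, for illustration only.
struct Counter {
    int value = 0;
};

// By value: the function works on a copy, the caller's object is untouched.
void bumpByValue(Counter c) { ++c.value; }

// By pointer: the caller passes an address; the callee should check for null
// and has to dereference explicitly (here with ->).
void bumpByPointer(Counter *c) {
    if (c != nullptr) {
        ++c->value;
    }
}

// By reference: same effect on the caller's object, but no explicit
// dereference and no pointer arithmetic.
void bumpByReference(Counter &c) { ++c.value; }
```

With a local `Counter c;`, the calls would be `bumpByValue(c)`, `bumpByPointer(&c)` and `bumpByReference(c)`; only the last two change c.value.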

So, to answer the question from the topic title: technically there is no difference. A pointer is a memory address, and the & operator yields the address of an instance... which is a memory address. It is pretty much OK to use &instance in places that require a pointer to an instance. The pitfalls: some function might try to delete the pointer, and if the memory of that object was not dynamically allocated, calling delete on the address of a local object results in undefined behavior, which means there is no guarantee what will happen. Or the function might expect a C style array or string, which is passed as a pointer to its first element, and try to increment it internally and walk off the memory address, crash the program, corrupt neighboring objects and so on.
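One possible way to guard against the delete pitfall is to let the function signature say who owns the object. This is only a sketch of that idea: Widget, readId and consume are hypothetical names, and std::unique_ptr is my suggestion rather than something from the posts above.

```cpp
#include <memory>

// Hypothetical type, for illustration only.
struct Widget {
    int id;
};

// Non-owning: works the same with &local or with a heap pointer,
// and must never call delete on what it receives.
int readId(const Widget *w) {
    return w != nullptr ? w->id : -1;
}

// Owning: the unique_ptr parameter states that the callee takes over the
// object's lifetime, so a plain &local cannot be passed in by accident.
int consume(std::unique_ptr<Widget> w) {
    return w != nullptr ? w->id : -1;
}   // the Widget is destroyed here
```

readId(&local) and readId(somePointer) behave identically, which is the "no difference" part; consume() only accepts something like std::make_unique<Widget>(...), which is the part that protects you from deleting a local.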

Also, since you say you are a newbie, I'd suggest getting into the habit of putting the * next to the identifier, not the type. Putting it next to the type is somewhat confusing, because it looks as if you are declaring the type as "type pointer".

@int* a, b;@

This is a fairly common mistake: newbies assume both a and b are pointers because it looks like we are declaring pointers here, but in reality only a is a pointer, and b is an int. A more readable convention would be:

@int *a, b;@

Putting the * with the identifier makes it much clearer that it applies only to that particular identifier, plus it is more uniform, because if you want a and b to be pointers, it is a little awkward putting the first * with the type and the second with the identifier.

@int* a, *b; // not very consistent
int *a, *b; // much better
int *ptrA, *ptrB, c; // even better when pointers are named so that you are reminded later on
@
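If you want the compiler to confirm which of the two is a pointer, a sketch like this should do it (it uses the same names a and b as the example above; the static_asserts are my addition):

```cpp
#include <type_traits>

int* a, b;  // the * binds to a only

// Both checks pass at compile time: a is int*, b is a plain int.
static_assert(std::is_pointer<decltype(a)>::value, "a is a pointer to int");
static_assert(std::is_same<decltype(b), int>::value, "b is a plain int");
```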

andre · wrote on 11 Feb 2013, 10:02

[quote author="utcenter" date="1360559679"]
Also, since you say you are a newbie, I'd suggest getting into the habit of putting the * next to the identifier, not the type. Putting it next to the type is somewhat confusing, because it looks as if you are declaring the type as "type pointer".

@int* a, b;@

This is a fairly common mistake: newbies assume both a and b are pointers because it looks like we are declaring pointers here, but in reality only a is a pointer, and b is an int. A more readable convention would be:

@int *a, b;@

Putting the * with the identifier makes it much clearer that it applies only to that particular identifier, plus it is more uniform, because if you want a and b to be pointers, it is a little awkward putting the first * with the type and the second with the identifier.

@int* a, *b; // not very consistent
int *a, *b; // much better
int *ptrA, *ptrB, c; // even better when pointers are named so that you are reminded later on
@
[/quote]

I tend to disagree. I find it conceptually easier to think of a pointer-to-a-type as a type in itself, and that means putting the * with the type, not the variable. To prevent the issue you rightfully point out, I would suggest you simply do this instead:
@
int* a;
int* b;
@
In my opinion, that results in easier to read code. Sure, it is a few characters more to write, but because code is more often read than written, I think that is well worth it.

Obviously, variables should have readable names, and single letters only very rarely qualify. About the only acceptable exception is using i and j as indexes in a loop.

utcenter · wrote on 11 Feb 2013, 10:25

bq. I tend to disagree. I find it conceptually easier to think of a pointer-to-a-type as a type in itself

Yes, this is the inevitable syntax war :) Of course, everyone has his own preferences, and in time it doesn't really matter to an experienced developer.

I am not some uber developer, I am pretty new to programming myself. But I think my preference is based on technical grounds.

We use shorthand when we say "integer pointer"; reading right to left, we read "a is a pointer to integer". The declaration syntax has no "integer pointer" type specifier (you could typedef one): the * acts as a qualifier for the identifier to its right, not as a modifier to the type on the left. That is why, if you declare more than one, the star only applies to the identifier on the right. If it were conceptually regarded as an "integer pointer" type, it should apply to everything on the line, just like it does if you typedef an intPointer type.
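The typedef contrast can be checked with a short sketch. The alias name IntPointer follows the wording above, and the variable names p, q, r and s are made up:

```cpp
#include <type_traits>

using IntPointer = int*;   // same effect as: typedef int* IntPointer;

IntPointer p, q;           // the alias is a complete type: both p and q are int*
int *r, s;                 // the * belongs to the declarator r only

static_assert(std::is_same<decltype(p), int*>::value, "p is int*");
static_assert(std::is_same<decltype(q), int*>::value, "q is int* as well");
static_assert(std::is_same<decltype(r), int*>::value, "r is int*");
static_assert(std::is_same<decltype(s), int>::value,  "s is a plain int");
```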

See, there is actual logic, based on how compilers parse and interpret the actual source.

Plus I don't really see any downside to putting the * next to the identifier. I doubt anyone would mistake it for dereferencing inside of a declaration.

To be honest, I have never gotten any substantiated reasoning for putting the * with the type. The * doesn't affect the type, it affects the identifier. But maybe I am missing something, and without any intent to start a war, it would be nice if you had something more than reading left to right, which is not how declarations in C++ are read. And I for one think it is a good idea to get into the practice of reading declarations correctly, so that one might some day be able to decipher something like

@char ((**a[][32])())[];@

BTW, your post formatting is a little off, it is a little hard to tell your response from the quote, might want to fix it.

andre · wrote on 11 Feb 2013, 10:43

I fixed the formatting, it's an annoying bug in the forum software that caused it.

I also don't want to start a formatting war, but one argument for looking at pointer-to as a separate type is its memory layout. A pointer-to is always just the size of a pointer (4 or 8 bytes, usually), and something that fits in a register. Also, it is an object on which some operations are defined differently than on, say, your int. Those differences make it, IMHO, a different beast than the type it is pointing to, and thus worth thinking of as a type of its own. Hence the * with the type.

I am willing to admit that there are also situations where thinking the other way around makes more sense.

I do think that people writing statements (in actual code) like you just did, are not writing very readable code and are thus, IMO, probably more interested in showing how clever they are than in creating maintainable software. That's not a good trait for a programmer on your team to have. It would actually make a nice interview question: ask people what that statement of yours means :-) I would be looking for the observation I just made from the candidate...

utcenter · wrote on 11 Feb 2013, 11:00

Yes, I am willing to grant it is easy and intuitive, but can you grant that it is not technically correct?

You can look at anything as anything, but this doesn't change the fact that an actual type would affect every one of the identifiers listed after it, not just the first one. The type is still the type and the * technically applies to the identifier, regardless of how it is conceptually perceived.

And no, I don't mean to troll about this. My point is that it is beneficial in the long term to force your mind to the technically correct and, I admit, less friendly alternative, because it is how the language works. I am also willing to grant that this might not apply to veteran developers such as yourself, but it can be beneficial for a newbie to wrap his mind around that concept. Because even though in this particular case one can avoid the "harsh reality", that is not always the case, and the sooner this reality is faced, the sooner one is prepared to avoid its pitfalls.

Since both are correct, what tips the scales for me is a pragmatic principle I tend to follow: "Technically correct is the best kind of correct"

No doubt, a single declaration per line is the most readable, plus it is good if your pay is measured in LOC. But I often put multiple mixed declarations on a single line. And I happened to make that exact mistake until I got into the habit of putting the * next to the token it applies to.

andre · wrote on 11 Feb 2013, 11:11

Actually, I think that C++ allows both ways of thinking. You are certainly right that for declarations like the ones you showed, the *-with-identifier view is the correct way to think about the meaning. However, I think it is not for nothing that you are allowed to write both
@
Type* ident;
@
and
@
Type *ident;
@

I don't think either one is the right way to think about pointers in all circumstances. Perhaps that is the message that we'd need to tell newbie programmers?

utcenter · wrote on 11 Feb 2013, 11:37

And let's not forget that

@Type*ident;@

is possible too :) Or any arbitrary amount of spaces, tabs, newlines and whatnot. That is one of the many reasons I dislike Python, because to me it sounds like a terrible idea for whitespace to matter.

Aethalides · wrote on 14 Feb 2013, 20:53

Thank you all for your thoughtful replies, and although I am a newbie when it comes to C/C++, I am not by a long shot new to programming, and it is in fact my day job.

Which is why I will focus more on understanding how the * applies to the creation of variables rather than on its position, because I know full well that I will encounter many variants, including

@Type * Identifier;
Type* Identifier;
Type*Identifier;
Type * Identifier;
Type /* Pointer! */ * /* Watch Out! */ Identifier;@

Knowing the theory behind it will make it easier for me to realise that they are all the same and allow me to emphasise ambiguous situations with extra comments.
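For what it's worth, a compiler can confirm that such spellings all declare the same thing. In this sketch Type is just a hypothetical stand-in for any real type, and v1 to v5 are made-up names:

```cpp
#include <type_traits>

using Type = int;  // hypothetical stand-in for any type

Type * v1;
Type* v2;
Type*v3;
Type *v4;
Type /* Pointer! */ * /* Watch Out! */ v5;

// Whitespace and comments around the * make no difference to the declared type.
static_assert(std::is_same<decltype(v1), Type*>::value, "v1 is Type*");
static_assert(std::is_same<decltype(v2), Type*>::value, "v2 is Type*");
static_assert(std::is_same<decltype(v3), Type*>::value, "v3 is Type*");
static_assert(std::is_same<decltype(v4), Type*>::value, "v4 is Type*");
static_assert(std::is_same<decltype(v5), Type*>::value, "v5 is Type*");
```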