When a software artefact is tested, it is necessary to identify a set of tests, the test cases, that provides good code coverage, since higher coverage increases the probability of detecting faults in the code. There are two testing strategies: black-box testing and white-box testing. In black-box testing the testers verify the results of executing the artefact against a specification of its behavior, whereas in white-box testing the testers know how the artefact was implemented. Therefore, in the former the tests are driven by the description that was used to produce the software artefact, and in the latter the tests are driven by the artefact itself, that is, by its implementation.
A black-box test case exercises the code according to the problem it solves, so the testers cannot know which execution paths are being tested. To achieve good coverage, they try to identify the variations in the problem that produce different results. For instance, equivalence partitioning is a technique that groups inputs which produce equivalent results from the perspective of the problem description: an empty stack and a non-empty stack define different partitions.
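The stack example above can be sketched in Python. This is a minimal illustration, not a reference implementation: the `Stack` class and the two `check_*` functions are hypothetical names introduced here, and the assumed specification is simply that popping an empty stack is an error while a non-empty stack returns items in last-in, first-out order. Each check picks one representative from a partition and looks only at observable behavior.

```python
class Stack:
    """A minimal LIFO stack; a hypothetical artefact under test."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        # Assumed specification: popping an empty stack is an error.
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        return not self._items


def check_empty_partition():
    """Representative of the 'empty stack' partition."""
    stack = Stack()
    try:
        stack.pop()
    except IndexError:
        return True  # the error required by the assumed specification
    return False


def check_non_empty_partition():
    """Representative of the 'non-empty stack' partition."""
    stack = Stack()
    stack.push(1)
    stack.push(2)
    # Items must come back in last-in, first-out order.
    return stack.pop() == 2 and stack.pop() == 1 and stack.is_empty()
```

Neither check inspects `_items` directly; both rely only on the public operations, which is what makes them black-box tests.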
White-box test cases can be designed to ensure that particular parts of the code are exercised. With white-box testing it is possible to create test cases for statement coverage (all statements are executed at least once), branch coverage (all alternatives of every branch are executed at least once), and path coverage (every path is executed at least once). Note that in a real software system full path coverage is generally not achievable, because the number of paths grows exponentially with the number of branches and loops can make it unbounded.
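The three coverage criteria can be compared on a small example. The sketch below is illustrative only: `classify`, the three suites, and `paths_exercised` are hypothetical names, and `paths_exercised` recomputes the two decisions from the inputs instead of instrumenting the code, which is a simplification that works only because the function is this small.

```python
def classify(x, y):
    """Hypothetical function with two independent decisions, hence four paths."""
    labels = []
    if x > 0:
        labels.append("x-positive")
    if y > 0:
        labels.append("y-positive")
    return labels


# Statement coverage: one input that makes both conditions true
# executes every statement of classify.
statement_suite = [(1, 1)]

# Branch coverage: each condition must evaluate to both True and False.
branch_suite = [(1, 1), (-1, -1)]

# Path coverage: every combination of the two decisions, 2 * 2 = 4 paths.
path_suite = [(1, 1), (1, -1), (-1, 1), (-1, -1)]


def paths_exercised(suite):
    """Return the set of (first decision, second decision) outcomes covered."""
    return {(x > 0, y > 0) for x, y in suite}
```

With two independent decisions the suites need one, two, and four test cases respectively. In general, n independent decisions in sequence give 2^n paths, which illustrates why path coverage quickly becomes impractical.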
White-box testing can be more accurate, because test cases can target specific parts of the code. However, it is not always possible, or even convenient, to do white-box testing. Typically, developers do white-box testing, whereas the members of the quality assurance team do black-box testing, which is less biased by how the software was built and allows the identification of faults that developers may miss. Therefore, both white-box and black-box testing should be used.
After any definition we can always ask boundary questions. Concerning the difference between black-box and white-box testing, where does the concept of full-stack fit? Is it specification or implementation?