Expanding the domain of discourse reveals structure already there but hidden
To understand a complex system like a mind or an ecosystem, we have to understand a tangled web of objects, features, processes, relations, correlations, clusters, constraints, causes, and so on. It helps to find underlying explanations and generating processes, and to find deep reasons that explain why many relationships are the way they are. To find those underlying explanations, it helps to know relationships between relationships: which relationships cause, explain, or constrain other relationships. When staring at an assembly of relationships and asking which relationships explain other relationships, it can seem like we're at a loss for where to go; sometimes there are clearly relationships, but there's no clear way to extract any further order, because there's no basis on which to say that one relationship explains another: the relationships are just there, and that's it.
By way of example/analogy, suppose we look at a list of facts like [3×4=12, 12/4=3, 4=12/3, 12=4×3], and ask, which facts are more foundational to which other facts? Which facts explain, ground, cause, or constitute the essence of, which other facts? Is 12/4=3 because 12=4×3, or vice versa? Is the evenness of 12 due to the evenness of 4? Or due to the evenness of 6? What if there's no answer, and all we can say is that all these facts stand together as related by derivations and proofs, with no priority given to any of them over any other?
Another example: take a smooth closed curve C in the plane. These conditions are equivalent:

There is a point p such that all points on C are equidistant from p.

C has constant curvature.

(If C bounds a disc) C bounds the maximum area that's possible to bound with a curve of the same length as C.

The group of isometries of the plane that map C to itself contains a nontrivial connected compact topological group.
These conditions all define a circle in the plane. If we're asked "Why does a locus of equidistant points have constant curvature?", we can answer with a proof. But if we're asked, "Does a circle have constant curvature because it maximizes area, or does it maximize area because it has constant curvature?", we might have nothing to say. If we ask "Why does a wheel roll?", should we be satisfied with the explanation: "To roll, it's necessary that it admits a nontrivial connected compact group of isometries; it admits these isometries because it has constant curvature; it has constant curvature because the points on its rim are equidistant from the axle"? What if instead the explanation ended with "...because its rim bounds the maximum area boundable by any rim of the same length"? What if we meant to ask why the rolling keeps the axle level, not why the rolling is a continuous motion?
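As a sketch of the kind of proof meant here, this is one way to see that a locus of equidistant points has constant curvature. The parametrization and the symbols $p = (a, b)$, $r$, $t$, and $\kappa$ are introduced only for illustration and are not part of the discussion above.

```latex
% Assumption: C is the set of points at distance r > 0 from p = (a, b).
\[
  \gamma(t) = \bigl(x(t), y(t)\bigr) = (a + r\cos t,\; b + r\sin t),
  \qquad t \in [0, 2\pi).
\]
% Signed curvature of a regular plane curve (x(t), y(t)):
\[
  \kappa(t) = \frac{x'(t)\,y''(t) - y'(t)\,x''(t)}
                   {\bigl(x'(t)^2 + y'(t)^2\bigr)^{3/2}} .
\]
% Here x' = -r sin t, y' = r cos t, x'' = -r cos t, y'' = -r sin t, so
\[
  \kappa(t)
  = \frac{(-r\sin t)(-r\sin t) - (r\cos t)(-r\cos t)}
         {\bigl(r^2\sin^2 t + r^2\cos^2 t\bigr)^{3/2}}
  = \frac{r^2}{r^3}
  = \frac{1}{r} .
\]
```

The derivation shows that one condition implies the other, but it says nothing about which of the two is prior.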
In some cases, deriving the presence of traits in a species from other present traits might be like this: the traits all imply each other, so the implications can be understood, but none of the traits underlie the other traits. This situation is not so implausible. For one thing, features of species that we name might be related to each other less like two objects relate to each other (causal dynamics, interaction) and more like logical properties (implication, equivalence). E.g. the feature "totally consumes its prey" pretty much logically implies "causes its prey to die". For another thing, many systems (especially biological and mental ones) have lots of feedback dynamics, autocausality, and amphicausality, so it's common for features to bootstrap each other, with no clear ordering to the zoomed-out causality. The wind pushing on the raised part of the wave causes the wave to become more raised, which causes the wind to catch on the wave more forcefully. Is the wave tall because it catches the wind, or does it catch the wind because it's tall? Both. The parasite evolves chemical defenses specialized to the immune system of one of its host species; that makes it more advantageous to live in that species rather than another; so the parasite evolves to swim to the layer of the water where that host lives; so the parasite lives in that host more frequently; so there's greater pressure to evolve anti-immune defenses to combat that host's immune system. The parasite swims to the top of the lake because it produces a protein with such-and-such effect, and it produces that protein because it swims to the top of the lake. (See The Lion and the Worm.)
This is not the end of the story, though. Looking back at our examples, it seems to me that 3×4=12 is somehow importantly prior to 12/4=3, though I don't know how; let me know if you do.
For the circle example, something interesting happens when we widen our view. In the flat, 2d plane, there's nothing to say about which conditions hold of a curve "due to" other conditions; they just coincide. If we look at a general manifold, though, these properties come apart. There are curves of constant curvature that don't satisfy the other conditions; for example, think of a curve drawn with constant curvature starting near the corner of a cube.
It's clearly not continuously symmetric, area maximizing, or an equidistant locus. An equator of a torus is fixed by rotations of the torus; but the equator is not an equidistant locus, and it doesn't even really bound an area at all. A maximal area curve need not be an equidistant locus; think of two thin tall towers standing near each other, and consider a circle on the ground that just goes around the two bases (distance is intrinsic distance on the surface, not distance in 3-space). There is structure, though: symmetry implies constant curvature. Maximal area sort of implies constant curvature, with caveats; see The Isoperimetric Problem on Surfaces. On the surface of a geometric sphere, the four conditions are again equivalent. On the surface of a cone, there are some maximal area curves that are symmetric and equidistant, and some that are equidistant but not symmetric. I think on any manifold with certain topologies, e.g. the sphere or plane but not the torus, symmetric implies equidistant; as long as there's a point fixed by the action of the symmetry group, the curve is (contained in?) an equidistant locus with center a point fixed by the group. (These relationships may have been implicit in the proofs of equivalence for the plane, but I don't know how to make that clear given logical equivalence, whereas the relationships for more general manifolds are somewhat more clear.)
In the circle example, by looking at a wider range of cases, structure is lit up where before there was just blank equivalence. Lighting up undistinguished structure also happens with causality: some joint distributions over variables assumed to be somehow causally related can have ambiguous causal structure, which is then disambiguated by adding more variables. Intervening to throw water on the sidewalk, inserting yourself as a new causal factor, disambiguates the theory that Rain causes Wet Sidewalk causes Slippage from the theory that Slippage causes Wet Sidewalk causes Rain. The [Rain→Wet→Slip] theory says that if you throw water on the sidewalk, Rain won't be more likely than usual but Slippage will likely happen, whereas the [Slip→Wet→Rain] theory says the opposite. (See the related Sequences post.)
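The sidewalk example can be made concrete with a small sketch. It builds one joint distribution over Rain, Wet, and Slip, factorizes it in both causal directions, and then compares what each theory predicts when Wet is forced to be true. All the probabilities are hypothetical numbers chosen for illustration, and the function names are made up for this sketch; exact fractions are used so that the two factorizations can be compared without rounding.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical probabilities, chosen only for illustration.
P_RAIN = F(1, 5)                               # P(Rain = 1)
P_WET_GIVEN_RAIN = {1: F(9, 10), 0: F(1, 10)}  # P(Wet = 1 | Rain)
P_SLIP_GIVEN_WET = {1: F(1, 2), 0: F(1, 20)}   # P(Slip = 1 | Wet)

NAMES = ("rain", "wet", "slip")


def bern(p, x):
    """Probability that a 0/1 variable with P(1) = p takes the value x."""
    return p if x == 1 else 1 - p


def forward_joint():
    """Joint P(rain, wet, slip) factorized as P(rain) P(wet|rain) P(slip|wet)."""
    return {
        (r, w, s): bern(P_RAIN, r)
        * bern(P_WET_GIVEN_RAIN[r], w)
        * bern(P_SLIP_GIVEN_WET[w], s)
        for r, w, s in product((0, 1), repeat=3)
    }


def prob(table, **fixed):
    """Total probability of all outcomes that match the fixed values."""
    return sum(
        p
        for outcome, p in table.items()
        if all(outcome[NAMES.index(k)] == v for k, v in fixed.items())
    )


def cond(table, target, given):
    """Conditional probability P(target | given), both passed as dicts."""
    return prob(table, **target, **given) / prob(table, **given)


J = forward_joint()


def backward_joint():
    """The same joint, refactorized as P(slip) P(wet|slip) P(rain|wet)."""
    return {
        (r, w, s): prob(J, slip=s)
        * cond(J, {"wet": w}, {"slip": s})
        * cond(J, {"rain": r}, {"wet": w})
        for r, w, s in product((0, 1), repeat=3)
    }


def do_wet_forward():
    """Rain causes Wet causes Slip: forcing Wet = 1 leaves Rain at its marginal."""
    return {"rain": P_RAIN, "slip": P_SLIP_GIVEN_WET[1]}


def do_wet_backward():
    """Slip causes Wet causes Rain: forcing Wet = 1 leaves Slip at its marginal."""
    return {"rain": cond(J, {"rain": 1}, {"wet": 1}), "slip": prob(J, slip=1)}


if __name__ == "__main__":
    print("same observational joint:", J == backward_joint())
    print("forward theory, do(Wet=1): ", do_wet_forward())
    print("backward theory, do(Wet=1):", do_wet_backward())
```

With these illustrative numbers the two factorizations agree on every observational probability, so passive observation cannot tell them apart. Under the intervention they diverge: the forward theory keeps Rain at 1/5 and puts Slip at 1/2, while the backward theory puts Rain at 9/13 and keeps Slip at 167/1000.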
Revealing structure by considering more possibilities also happens in reverse mathematics. By working over a logical theory that makes few assumptions (i.e. by "unassuming" axioms, expanding the set of models under consideration, compared to logical theories that everyday mathematics takes for granted), we can see which subset of axioms is really needed to prove a given theorem of ordinary mathematics. Then we can formally show statements like "Gödel's completeness theorem is equivalent to the Jordan curve theorem, but the Bolzano-Weierstrass theorem is strictly stronger than those two", where this statement would have been meaningless if we were working in a standard strong set theory, as all those statements are logical consequences of the assumptions of everyday mathematics.
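For reference, the quoted statement can be written in the usual notation of reverse mathematics. The subsystem names $\mathsf{RCA}_0$, $\mathsf{WKL}_0$, and $\mathsf{ACA}_0$ do not appear in the text above; they are the conventional names for the weak base theory and two stronger systems, and the summary below assumes the standard classification (completeness for countable languages, Bolzano-Weierstrass for bounded sequences of reals).

```latex
% Over the weak base theory RCA_0, each theorem is equivalent to a subsystem:
\[
  \mathsf{RCA}_0 \vdash\;
  \mathsf{WKL}_0
  \;\leftrightarrow\; \text{G\"odel's completeness theorem}
  \;\leftrightarrow\; \text{Jordan curve theorem}
\]
\[
  \mathsf{RCA}_0 \vdash\;
  \mathsf{ACA}_0
  \;\leftrightarrow\; \text{Bolzano--Weierstrass theorem}
\]
% ACA_0 is strictly stronger than WKL_0:
\[
  \mathsf{ACA}_0 \vdash \mathsf{WKL}_0,
  \qquad
  \mathsf{WKL}_0 \nvdash \mathsf{ACA}_0 .
\]
```

The equivalences are only informative because the base theory is too weak to prove any of the three theorems outright.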
In the case of the traits of an organism, there may be further questions we can ask and answer if we look at species nearby spatially, or in niche space, or phylogenetically, or if we look back in time. Why did this worm evolve that way, and not some other organism from some distant clade? Why didn't more worms from the same clade evolve this way? Why could this worm evolve this way, instead of being stuck behind an activation barrier?
Why don't the little waves in between the big waves get bigger themselves? Would waves form if perfectly uniform wind hovered over the perfectly flat face of the waters? What would break the symmetry?
All this is to say: just because two things are "logically equivalent", that is, logically imply each other, doesn't mean they have the same meaning. If we can understand the assumptions that make two statements logically equivalent, we might be able to drop some of those assumptions, and find that the two statements are no longer equivalent, having been given meaning in this new context by some unexplained but hopefully not completely arbitrary process. (Dropping assumptions may not be easy or unambiguous; how do you drop the assumption that "A and B" is equivalent to "B and A"? Possible but weird.) Just because two theories make the same predictions doesn't mean they're equivalent as mental objects: you could mangle the Aristotelian theory of motion until it makes the same predictions as the Galilean theory about ordinary objects falling, flying, pulling, colliding, and so on, but when it comes time to understand electromagnetism, you'd have a harder time adapting your ideas to that purpose than the Galilean would. Just because two features causally reinforce each other, so that locally speaking each feature causes and implies and explains the other, doesn't mean they're the same feature, and doesn't mean their correlation won't be violated in some other context, and doesn't mean that neither feature came first.