So neither solution is as good as native for motion clarity, but DLSS is much closer to native, though not quite there.
Native exhibits ghosting in some scenes whereas DLSS/FSR 2.0 do not, so it seems to very much depend on the scene and the kind of movement, object, etc. Overall, DLSS looks to be better in motion if you weigh "everything", i.e. temporal stability, lack of ghosting and so on (going by DF's comments).
At the very end of the video he showed another weird ghosting issue with DLSS too that looks very similar to the FSR example posted earlier. Makes me think it might be an issue unique to this game rather than tied to either tech.
I think it is just down to different implementations doing things slightly differently to achieve certain goals, i.e. if AMD were to remove all the sharpening, you would probably get closer to DLSS's/native's temporal stability, but then you would obviously lose the extra clarity, and so on.
The more frames you use, the more accurate the reconstruction will be, but the worse ghosting becomes, and vice versa (fewer frames, less accuracy, less ghosting). The more you try to mitigate ghosting, the less accuracy you get during disocclusion events (moments where the algorithm basically discards previous frames and starts fresh because it detects it would ghost too much).
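To picture the trade-off, here's a rough sketch (illustrative only, not any vendor's actual code) of temporal accumulation as an exponential history blend. A small `alpha` effectively averages over many past frames (more accumulated detail, more ghosting); a large `alpha` forgets history quickly (less ghosting, less accuracy):

```python
def accumulate(frames, alpha):
    """Run a per-pixel history blend over a sequence of per-frame values.

    alpha is the weight of the newest frame; (1 - alpha) is how much
    of the accumulated history survives each frame.
    """
    history = frames[0]
    for cur in frames[1:]:
        history = (1.0 - alpha) * history + alpha * cur
    return history

# A bright object (1.0) leaves the pixel, which goes dark (0.0):
frames = [1.0, 0.0, 0.0, 0.0]
print(accumulate(frames, alpha=0.1))  # heavy history: brightness lingers (ghosting)
print(accumulate(frames, alpha=0.9))  # light history: snaps to dark quickly
```

With `alpha=0.1` the stale brightness decays slowly (a visible trail); with `alpha=0.9` it vanishes in a couple of frames, but you also lose the noise-averaging benefit of the history.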
Upscaling is just an extremely difficult problem to solve acceptably, and is actually impossible to solve perfectly. If you're upscaling from a 0.5x resolution source, you're effectively turning 1 input pixel into 4 output pixels. No matter how you go about doing this, the algorithms will always make certain assumptions that you need to correct for to produce accurate results, and these corrections will always produce unfortunate side effects that you then need to correct for again.
If you're using a spatial upscaler (i.e. FSR 1.0, RSR, NIS, or the default bilinear filter the driver uses), then you're looking at neighbouring input pixels to figure out how the image changes in the nearby area, to estimate what the output pixels should be. This estimation will always be inaccurate since you're basing it on a low-resolution neighbourhood, which leads to blurry results, so you need to correct for that by sharpening, but sharpening can easily make the image look deep fried, so you need to balance it carefully.
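That blur-then-sharpen pipeline can be sketched in 1-D (hypothetical toy code, nothing like the real FSR 1.0/NIS kernels): bilinear interpolation softens a hard edge into a ramp, and an unsharp-mask style pass steepens it back, overshooting if pushed too hard:

```python
def bilinear_2x(row):
    """Double a 1-D row by inserting the average of each neighbour pair."""
    out = []
    for i, p in enumerate(row):
        out.append(p)
        nxt = row[i + 1] if i + 1 < len(row) else p
        out.append((p + nxt) / 2)   # estimated from the low-res neighbourhood
    return out

def sharpen(row, strength):
    """Unsharp mask: push each pixel away from its local average."""
    out = []
    for i, p in enumerate(row):
        left = row[i - 1] if i > 0 else p
        right = row[i + 1] if i + 1 < len(row) else p
        blur = (left + p + right) / 3
        out.append(p + strength * (p - blur))
    return out

edge = [0, 0, 10, 10]          # a hard edge in the low-res input
up = bilinear_2x(edge)         # the edge becomes a blurry ramp
print(up)
print(sharpen(up, 1.0))        # steeper edge, but note the over/undershoot
```

With `strength=1.0` the values next to the edge already overshoot past 10 and undershoot below 0, which is exactly the halo/"deep fried" look you get when sharpening is cranked too high.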
If you're using a temporal upscaler (i.e. FSR 2.0, DLSS, XeSS, TAA, TAAU, TSR, etc.), then you're shifting (jittering) the image by slight but meaningful sub-pixel increments each frame to change the contents of each pixel over time, and you're using past frames to look at how the 1 input pixel changes over time, to estimate what the output pixels should be. This estimation assumes that pixels only change over time due to that slight-but-meaningful movement, which is obviously not the case: objects can move on their own, lighting conditions can change, one object can move in front of another and block it, etc. So you need to correct for that by feeding that information into an algorithm that can detect how much a pixel has changed and compensate (either an analytical algorithm, like FSR 2.0, TAA, TAAU and TSR, or an AI-based one, like DLSS and XeSS). But this algorithm isn't perfect: it won't catch changes in every situation (i.e. residual ghosting even with the algorithm), its own correction may produce unwanted side effects (i.e. the weird black artifacts with FSR 2.0), and so you need to carefully tune it accordingly.
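The shape of that temporal loop can be sketched in 1-D (a hypothetical toy, nothing like the real DLSS/FSR 2.0 internals): reproject last frame's pixel along its motion vector, then either blend it in or reject it when it no longer matches the current frame:

```python
def reproject(history, motion, x):
    """Fetch last frame's value for pixel x, following its motion vector."""
    src = x - motion
    if 0 <= src < len(history):
        return history[src]
    return None                      # moved in from off-screen: no history

def resolve(history, current, motion, reject_threshold):
    """Blend reprojected history into the current frame, rejecting
    history that disagrees too much (disocclusion / missed change)."""
    out = []
    for x, cur in enumerate(current):
        prev = reproject(history, motion, x)
        if prev is None or abs(prev - cur) > reject_threshold:
            out.append(cur)                     # rejected: fresh but unstable
        else:
            out.append(0.9 * prev + 0.1 * cur)  # accepted: stable, accumulated
    return out

history = [1.0, 1.0, 5.0, 1.0]       # bright object at index 2 last frame
current = [1.0, 1.0, 1.0, 5.0]       # it moved one pixel to the right
print(resolve(history, current, motion=1, reject_threshold=0.5))   # tracked cleanly
print(resolve(history, current, motion=0, reject_threshold=10))    # bad motion + lax rejection
```

With the correct motion vector the history lines up and the result is clean; with a wrong motion vector and a rejection threshold too lax to catch it, stale brightness bleeds into the old position and dims the new one, which is exactly the residual ghosting case described above.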